1.
Introduction
In the present work, we discuss discontinuous Galerkin (DG) approximations to a nonlinear optimal control problem (OCP) of ordinary differential equations (ODEs). More precisely, we consider the following optimal control problem:
subject to
Here u(t)∈Rm is the control, and x(t)∈Rd is the state of the system at time t∈[0,T]. Further, g:[0,T]×Rd×Rm→R and f:[0,T]×Rd×Rm→Rd are given, and the set of admissible controls Uad⊂U:=L∞(0,T;Rm) is given by
for some uℓ,uu∈Rm. Here the inequality is understood in the component-wise sense.
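For the reader's convenience, the pair (1.1) and (1.2) can be assembled from the data just introduced; the following display is our reconstruction from the surrounding definitions (the symbols g, f, x0, uℓ, uu are those above), not a verbatim copy of the original equations:

```latex
\min_{u \in \mathcal{U}_{ad}} \; J(u, x) := \int_0^T g\big(t, x(t), u(t)\big)\, dt,
\qquad \text{subject to} \qquad
\begin{cases}
x'(t) = f\big(t, x(t), u(t)\big), & t \in (0, T),\\
x(0) = x_0,
\end{cases}
```

with the admissible set

```latex
\mathcal{U}_{ad} := \{\, u \in \mathcal{U} \;:\; u_\ell \le u(t) \le u_u \ \text{for a.e. } t \in (0, T) \,\}.
```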
There have been many studies on the numerical computation of the above problem. The numerical schemes require a discretization of the ODEs; for example, the Euler discretization for OCPs of ODEs is well studied for sufficiently smooth optimal controls based on strong second-order optimality conditions [2,13,14]. For optimal control problems in which the control appears linearly, the optimal control may be discontinuous (for instance, a bang-bang control), and such conditions may not be satisfied. In that respect, many studies have developed new second-order optimality conditions for optimal control problems with the control appearing linearly [3,21,31,32]. Second-order Runge-Kutta approximations for the OCPs were studied in [15]. Recently, the works [16,17] developed a novel stability technique to obtain new error estimates for the Euler discretization of OCPs.
The pseudospectral method is also popular for the discretization due to its high-order accuracy for smooth solutions of the OCPs [20,33]. However, this high-order accuracy is often lost for bang-bang OCPs, where the solutions may not be smooth enough. To handle this issue, Henriques et al. [24] proposed a mesh refinement method based on a high-order DG method for the OCPs of ODEs. The DG method partitions the time interval into small subintervals, on which a weak formulation is employed. The test functions are usually piecewise polynomials, which may be discontinuous at the boundaries of the subintervals; see Section 2 for a more detailed discussion. We refer to [7,19,34] and the references therein for DG methods for ODEs. It is also worth mentioning papers on the analysis of discretizations of optimal control problems of PDEs, for example, elliptic problems [1,23,35] and parabolic problems [9,12,25,26,27,28,29]. In addition, the recent works [22,30] studied the discretization of optimal control for fractional diffusion problems.
In this paper, we provide a rigorous analysis of the DG discretization applied to the nonlinear OCPs (1.1) and (1.2) with arbitrary order r∈N∪{0} for general functions f and g with suitable smoothness. Motivated by a recent work of Neitzel and Vexler [29], we impose the non-degeneracy condition (2.4) on an optimal control ˉu of the OCPs (1.1) and (1.2). We obtain existence and convergence results for both the semi-discrete and the fully discrete case. The convergence rates depend on the regularity of the optimal solution ˉu and its adjoint state, together with the degree of the piecewise polynomials mentioned above; see Section 2 for details.
It is worth noting that the control is not required to enter the state Eq (1.2) linearly, and the control space Uad admits discontinuous controls. The control constraints are given by lower and upper bounds. Moreover, the cost functional is given in a general form, not restricted to quadratic ones. We mention that the DG discretization of zeroth order was used in [29] for an optimal control problem for a semilinear parabolic equation in which the control enters the system linearly.
For notational simplicity, we denote by I:=(0,T), X:=L2(I;Rd), and (v,w)I=(v,w)L2(I;Rd). We also use simplified notations:
for 1≤p≤∞. Throughout this paper, for any compact set K⊂Rm, we assume that f,g∈C([0,T];W3,∞(Rd×K)) satisfy
for some M>0.
We next introduce the control-to-state mapping G:U→X∩L∞(I;Rd), G(u)=x, with x solving (1.2). It induces the cost functional j:U→R+, u↦J(u,G(u)). This makes the optimal control problems (1.1) and (1.2) equivalent to
Definition 1.1. A control ˉu∈Uad is a local solution of (1.4) if there exists a constant ε>0 such that j(u)≥j(ˉu) holds for all u∈Uad with ‖ˉu−u‖L2(I)≤ε.
In the proofs of the existence and convergence results, the main task is to show that the strong convexity of j, induced by the second-order optimality condition (2.4), is preserved near the optimal control ˉu and for its DG-discretized version jh. This is achieved using the second-order analysis in Section 4. As a preliminary, we also justify that j and jh are twice differentiable by showing the differentiability of the control-to-state mapping G and its discretized version Gh in the appendix.
In Section 2, we explain the DG discretization of the ODEs and the OCP. Then we present the main results for the semi-discrete case and provide some preliminary results. In Section 3, the adjoint problems are studied. Section 4 is devoted to the second-order analysis of the cost functionals j and jh. In Section 5, we prove the existence of the local solution and obtain the convergence rate for the semi-discrete case. Section 6 is devoted to establishing the existence and convergence results for the fully discrete case. Finally, in Section 7, we perform several numerical experiments for linear and nonlinear OCPs. In Appendix A, we obtain the first and second order derivatives of the control-to-state mapping G. Appendix B is devoted to proving a Grönwall-type inequality for the discretization of the ODE (1.2) involving the control variable. It is used in Appendix C to establish the differentiability of the discrete control-to-state mapping Gh and to obtain its derivatives. In Appendix D, we prove Lemmas 3.3 and 3.5, which reformulate the first derivatives of the cost functionals in terms of the adjoint states. In Appendix E, we derive formulas for the second order derivatives of the cost functionals.
2.
DG formulation
In this section, we describe the approximation of the OCPs (1.1) and (1.2) with the DG method, and then we state the main results on the semi-discrete case. First, we illustrate the discretization of the ordinary differential equations
where x:[0,T]→Rd, F:(0,T)×Rd→Rd is uniformly Lipschitz continuous with respect to x, i.e.,
with a constant L>0. By the Cauchy-Lipschitz theorem, (2.1) admits a unique classical solution x.
Given an integer N∈N, we consider a partition of I into N intervals {In}Nn=1 given by In=(tn−1,tn) with nodes 0=:t0<t1<⋯<tN−1<tN:=T. Let hn be the length of In, i.e., hn=tn−tn−1, and set h:=max1≤n≤Nhn. For a piecewise continuous function φ:[0,T]→Rd, we also define
The jump across the node tn is denoted by [φ]n:=φ+n−φ−n for 1≤n≤N−1. For r∈N∪{0}, we define
where Pr(In) represents the set of all polynomials of t up to order r defined on In with coefficients in Rd. Then the DG approximate solution xh of (2.1) is given as
for all φ∈Xrh. Here (⋅,⋅) denotes the inner product in Rd, and
for integrable functions A,B:In→Rd.
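For r=0 the trial and test functions are piecewise constant, and the scheme reduces to a one-step method resembling the implicit Euler scheme. The following minimal sketch (our own illustration; the model problem x′=ax and all names are ours, not from the paper) implements this DG(0) step for a scalar linear ODE and exhibits the expected first-order convergence at the final node:

```python
import math

def dg0_linear(a, x0, T, N):
    """DG(0) approximation of x' = a*x, x(0) = x0, on N uniform intervals.

    With piecewise-constant trial/test functions, x_h' = 0 on I_n and the
    weak form with the upwind jump term reduces (using exact quadrature) to
        x_n - x_{n-1} = a * h * x_n,   i.e.   x_n = x_{n-1} / (1 - a*h).
    """
    h = T / N
    x = x0
    for _ in range(N):
        x = x / (1.0 - a * h)
    return x

# First-order convergence at t = T, consistent with the rate h^{r+1} for r = 0.
errors = [abs(dg0_linear(-1.0, 1.0, 1.0, N) - math.exp(-1.0)) for N in (10, 20, 40)]
```

Halving h roughly halves the error, as the list `errors` shows.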
We recall the error estimate for the DG approximation of (2.1) from [34, Corollary 3.15 and Theorem 2.6].
Theorem 2.1. Let x(t) be the solution of (2.1) such that x∈Wk,∞(I;Rd) for some k≥1. Suppose that hL<1. Then there exists a unique DG approximate solution xh∈Xrh to (2.2) of order r∈N∪{0}. Furthermore, we have
where C>0 is determined by L, T, and r.
Now, for given u∈U, we consider the approximate solution x∈Xrh of the control problem (1.2) satisfying
for all φ∈Xrh.
Throughout the paper, we will consider local solutions ˉu to (1.4) satisfying the following non-degeneracy condition.
Assumption 1. Let ˉu∈Uad be a local solution of (1.1). We assume that it satisfies
for some γ>0.
The differentiability of the cost functional j(u)=J(u,G(u)) with respect to u∈U follows from the differentiability of the solution mapping G(u) justified in Appendix A (see also the proofs of Lemmas 3.3 and E.1). Note that the above second-order optimality condition holds under suitable regularity assumptions on the functions f and g and on the solutions; see Remark E.2 for a detailed discussion. We refer to [4,5] for further discussion of the second-order condition, and to [8,10,11] for optimal control problems of PDEs.
In addition, we assume that ˉu∈Uad has bounded total variation, namely V(ˉu)≤R/2 for a fixed value R>0. Here the total variation V(f) of f∈L∞(0,T) is defined as
where P is any partition P={0=x0<x1<x2<⋯<xn<xn+1=T}.
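For a control known at finitely many sample points (e.g., a piecewise-constant bang-bang control sampled at its breakpoints), the supremum is attained at the sample partition, so V(f) reduces to a finite sum. A small illustration (our own helper, not from the paper):

```python
def total_variation(samples):
    """Total variation of a function known at finitely many points:
    the sum of |f(x_{i+1}) - f(x_i)| over the sample partition."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))

# A bang-bang control switching twice between the bounds -1 and 1.
tv = total_variation([1.0, 1.0, -1.0, -1.0, 1.0])
```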
Considering a discrete control-to-state mapping Gh:U→Xrh, u↦Gh(u), where Gh(u) is the solution of (2.3), we introduce the discrete cost functional jh:U→R+,u↦J(u,Gh(u)). Let us consider the following discretized version of (1.1):
where
We now define the local solution to (2.5) as follows.
Definition 2.2. A control ˉuh∈Uad∩VR is called a local solution of (2.5) if there exists a δ>0 such that jh(u)≥jh(ˉuh) holds for all u∈Uad∩VR with ‖u−ˉuh‖L2(I)≤δ.
In the first main result, we prove the existence of the local solution to the approximate problem (2.5).
Theorem 2.3. Let ˉu∈Uad∩VR/2 be a local solution of (1.1) satisfying Assumption 1. Then, there are constants ε>0 and h0>0 such that for h∈(0,h0) the approximate problem (2.5) has a local solution ˉuh∈Uad∩VR satisfying ‖ˉuh−ˉu‖L2(I)<ε.
The second main result is the following convergence estimate of the approximate solutions.
Theorem 2.4. Let ˉu∈Uad∩VR/2 be a local solution of (1.4) satisfying Assumption 1, let ˉuh be the approximate solution found in Theorem 2.3, and let λ(ˉu) be the adjoint state defined in Definition 3.1 below. Assume that the state ˉx=G(ˉu) belongs to Wk1,∞(I;Rd) and the adjoint state λ(ˉu) belongs to Wk2,∞(I;Rd) for some k1,k2≥1. Then we have
The required regularity of the solutions ˉx and λ(ˉu) can be obtained under suitable smoothness assumptions on f, g, and ˉu; see Remark 3.2 below. The above result establishes the error estimate concerning the discretization of the ODEs in the OCPs. We will give the proofs of Theorems 2.3 and 2.4 in Section 5. On the other hand, to implement a numerical computation for the OCP (1.4), one also needs to consider an approximation of the control space by a finite-dimensional space. In Section 6, we will see that the proof of Theorem 2.4 can be extended to an error analysis incorporating the discretization of the control space.
3.
Adjoint states
This section is devoted to the study of the adjoint states for the OCP (1.1) and its discretized version (2.5).
We introduce a bilinear form b(⋅,⋅) for x∈W1,∞(0,T) and φ∈X by
Then, for a fixed control u∈U and initial data x0∈Rd, a weak formulation of (1.2) can be written as
for all φ∈X with x(0)=x0.
Definition 3.1. For a control u∈U, we define the adjoint state λ=λ(u)∈W1,∞(0,T) as the solution to
with λ(T)=0. It satisfies the weak formulation
for all φ∈X with λ(T)=0.
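For orientation, the adjoint equation of Definition 3.1 has the standard strong form; the following display is our reconstruction, consistent with Remark 3.2 and the derivative formulas of Appendix D, with x=G(u):

```latex
-\lambda'(t) = \nabla_x f\big(t, x(t), u(t)\big)^{\top} \lambda(t)
             + \nabla_x g\big(t, x(t), u(t)\big),
\qquad \lambda(T) = 0 .
```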
Remark 3.2. It follows from the Eqs (1.2) and (3.3) that if
we have
For u,v∈U, the derivative of j at u in the direction v is defined by
It is well-known that the derivative of the cost functional can be calculated with the adjoint state, as described below.
Lemma 3.3. We have
for all v∈Uad, where x=G(u).
Proof. For completeness, we give the proof in Appendix D.
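The adjoint representation of j′ can be checked numerically on a toy problem. The sketch below (entirely our own example: a linear-quadratic problem with x′=−x+u, x(0)=1, and cost ∫½(x²+u²)dt, discretized by explicit Euler rather than DG) forms the exact discrete adjoint gradient of the discrete cost and compares a directional derivative against a finite difference:

```python
import random

def solve_state(u, x0, h):
    """Explicit Euler for x' = -x + u, x(0) = x0."""
    x = [x0]
    for ui in u:
        x.append(x[-1] + h * (-x[-1] + ui))
    return x

def cost(u, x, h):
    """j_h(u) = sum_i h/2 (x_i^2 + u_i^2), i = 0..N-1."""
    return sum(0.5 * h * (x[i] ** 2 + u[i] ** 2) for i in range(len(u)))

def gradient(u, x, h):
    # Exact discrete adjoint of the Euler scheme:
    #   lam_i = (1-h) lam_{i+1} + h x_i,  lam_N = 0;
    # the exact gradient of j_h is then grad_i = h (u_i + lam_{i+1}).
    N = len(u)
    lam = [0.0] * (N + 1)
    for i in range(N - 1, -1, -1):
        lam[i] = (1.0 - h) * lam[i + 1] + h * x[i]
    return [h * (u[i] + lam[i + 1]) for i in range(N)]

N, T, x0 = 200, 1.0, 1.0
h = T / N
random.seed(0)
u = [random.uniform(-1.0, 1.0) for _ in range(N)]
v = [random.uniform(-1.0, 1.0) for _ in range(N)]

x = solve_state(u, x0, h)
directional = sum(gi * vi for gi, vi in zip(gradient(u, x, h), v))

s = 1e-6
us = [ui + s * vi for ui, vi in zip(u, v)]
fd = (cost(us, solve_state(us, x0, h), h) - cost(u, x, h)) / s
```

Since the toy cost is quadratic in u, `fd` and `directional` agree up to an O(s) term.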
Next we describe the adjoint problem for the approximate problem (2.5). For x,φ∈Xrh, we define
For the approximate solution xh=Gh(u)∈Xrh, Eq (2.3) with control u∈U can be written as
Now we define the adjoint equation for the approximate problem (2.5).
Definition 3.4. The adjoint state λh=λh(u)∈Xrh is defined as the solution of the following discrete adjoint equation:
In Appendix D, we briefly explain how the adjoint Eq (3.8) can be derived from the Lagrangian related to (2.5). We also have an analogous result to Lemma 3.3.
Lemma 3.5. We have
where xh=Gh(u).
Proof. The proof is given in Appendix D.
In order to prove the main results in Section 2, we shall use the following lemma.
Lemma 3.6. Let u∈U. Suppose that x=G(u)∈Wk1,∞(I;Rd) and λ=λ(u)∈Wk2,∞(I;Rd) for some k1,k2≥1. Then we have
Proof. We recall from (3.4) and (3.8) that λ=λ(u) solves
and λh=λh(u) solves
Here x=G(u)∈X and xh=Gh(u)∈Xrh. The estimate of x−xh follows from Theorem 2.1:
As an auxiliary function, we consider ζh∈Xrh solving
which is the DG discretization of (3.11) in a backward way (see Lemma 3.7 below). Then, by Theorem 2.1, we have
By (3.13), we obtain
and
Combining these estimates with (3.12) and (3.14) gives
where R:I→Rd is given by
and it satisfies ‖R(t)‖=O(hmin{k1,r+1}). This, together with Lemma B.4, yields
Combining this estimate with (3.15),
which completes the proof.
Abusing notation slightly, let us define J as the interval I equipped with the reversed partition 0=s0<s1<⋯<sN−1<sN=T, where sj=T−tN−j. We also write Xrh,J for the DG space Xrh associated with this new partition. Then we have the following lemma.
Lemma 3.7. Assume that λ∈Xrh is a solution to
Then W:I→Rd defined by W(t)=λ(T−t) for t∈I=[0,T] satisfies
Proof. By an integration by parts,
which leads to
We now observe that W(t)=λ(T−t) satisfies W′(t)=−λ′(T−t) and [W]N−n=−[λ]n. We also set ψ(t)=ϕ(T−t). Then ψ∈Xrh,J and we have ϕ−n=ψ+N−n. Considering Jn:=(sn−1,sn), the reflection t↦T−t maps Jn onto IN+1−n for 1≤n≤N. Using these notations, we write (3.16) as
Rearranging this, we get
which is the desired equation B(W,ψ)=(F(t,W),ψ)I. The proof is finished.
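The time-reversal argument can be exercised numerically. The sketch below (our own scalar example, with an implicit one-step scheme standing in for the r=0 DG solve) solves the terminal-value problem −λ′=aλ+c, λ(T)=0 through the reversed forward problem for W(t)=λ(T−t) and compares with the exact solution:

```python
import math

def backward_via_reversal(a, c, T, N):
    """Solve -lam' = a*lam + c, lam(T) = 0 by the substitution
    W(t) = lam(T - t), which yields the forward problem
        W' = a*W + c,  W(0) = 0   (cf. Lemma 3.7),
    discretized by the implicit one-step rule W_n = (W_{n-1} + c*h)/(1 - a*h).
    Returns W(T), i.e. lam(0)."""
    h = T / N
    W = 0.0
    for _ in range(N):
        W = (W + c * h) / (1.0 - a * h)
    return W

a, c, T = -2.0, 1.0, 1.0
approx = backward_via_reversal(a, c, T, 4000)
exact = (c / a) * (math.exp(a * T) - 1.0)   # lam(0) for the exact solution
```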
4.
Second order analysis
In this section, we analyze the second-order conditions for the functionals j and jh, which are essential for the existence and convergence estimates in the next sections.
4.1. Second order condition for j
We defined the solution mapping G:U→X∩L∞(I;Rd) in the previous section. Here we present Lipschitz estimates for the solution mapping G, its derivative G′, and the solution to the adjoint Eq (3.4).
Lemma 4.1. There exists C>0 such that for all u,ˆu∈Uad and v∈U we have
and
Proof. Let us denote by x=G(u) and ˆx=G(ˆu). Then it follows from (3.2) that
By (1.3), there exists a constant C>0 such that
Using this estimate and applying the Grönwall inequality in (4.1), we get the inequality
This gives the first inequality. For the second one, if we set y=G′(u)v and ˆy=G′(ˆu)v, then it follows from Lemma A.1 that
This together with the first assertion above yields
For notational simplicity, we denote by λ=λ(u) and ˆλ=λ(ˆu). Then we get
with (λ−ˆλ)(T)=0. By applying the Grönwall inequality in a backward way, we obtain
where we used
due to (3.3) and ˆλ(T)=0. This completes the proof.
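For reference, the Grönwall inequality invoked above (and its backward variant, obtained through the substitution t↦T−t) is used in the classical integral form:

```latex
\varphi(t) \;\le\; \alpha + \beta \int_0^t \varphi(s)\, ds
\quad \text{for all } t \in [0, T],
\qquad \alpha, \beta \ge 0
\qquad \Longrightarrow \qquad
\varphi(t) \;\le\; \alpha\, e^{\beta t} .
```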
We now show that the second order condition of j holds near the optimal local solution ˉu∈Uad.
Lemma 4.2. Suppose that ˉu∈Uad satisfies Assumption 1. Then there exists ε>0 such that
holds for all v∈U and all u∈Uad with ‖u−ˉu‖L2(I)≤2ε. Here γ>0 is given in (2.4).
Proof. Let y(t)=G′(u)v and y(ˉu)(t)=G′(ˉu)v. By using Lemma E.1, we find
where we denoted by λ(t):=λ(u)(t), x(t):=G(u)(t), ˉλ(t):=λ(ˉu)(t), and ˉx(t):=G(ˉu)(t). On the other hand, it follows from Lemma 4.1 that
This together with the following estimate
yields
Combining this with (2.4) we have
By choosing ε=γ/(4C)>0 here, we obtain the desired result.
As a consequence of this lemma, we have the following result.
Theorem 4.3. Let ˉu∈Uad satisfy the first-order optimality condition and Assumption 1. Then, there exists a constant ε>0 such that
for any u∈Uad with ‖u−ˉu‖L2(I)≤2ε.
Proof. Choose ε>0 as in Lemma 4.2. By Taylor's theorem, we get
where ˉus=ˉu+s(u−ˉu) for some s∈[0,1]. On the other hand, the first optimality condition implies
Moreover, we also find
Using these observations and Lemma 4.2, we conclude
The proof is finished.
4.2. Second order condition for jh
In this part, we investigate the second-order condition for the discrete cost functional jh. As in the previous subsection, we first provide the Lipschitz estimates for Gh and the discrete adjoint state.
Lemma 4.4. Let u,ˆu∈Uad and v∈U be given. Then, there exists C>0, independent of h∈(0,1), such that
and
Proof. The first and the third assertions are proved in Lemma B.5. The second estimate is proved in Lemma C.2.
Lemma 4.5. For u∈Uad, let x=G(u) be given by the solution of the state Eq (1.2), and let y=G′(u)v for v∈U. Let xh=Gh(u) be the solution of the discrete state Eq (3.7), and let yh=G′h(u)v. Then we have
Proof. Define ˜y:[0,T]→Rd by the solution to
Recall from Lemma A.1 that y satisfies
Combining these two equations, we get
Using the Grönwall inequality here with (4.2) and (3.13), we find that
On the other hand, yh satisfies
which is the DG discretization of (4.4) in a backward way in view of Lemma 3.7. Thus, we may use Theorem 2.1 to obtain the following error estimate:
This, together with (4.5) gives us the estimate
The proof is finished.
Lemma 4.6. For ε>0 given in Lemma 4.2, there exists h0>0 such that for h∈(0,h0) we have the following inequality
for any u∈Uad satisfying ‖u−ˉu‖L2(I)≤ε.
Proof. We first claim that
for h>0 small enough, where C>0 is independent of h. Let x(t)=G(u)(t), λ(t)=λ(u)(t), xh(t)=Gh(u)(t), and λh(t)=λh(u)(t). Also we let y=G′(u)v and yh=Gh′(u)v. It follows from Lemmas E.1 and E.3 that
In order to show (4.6), by using a similar argument as in the proof of Lemma 4.2, it suffices to show that there exists C>0, independent of h, such that
and
The first and second inequalities in (4.7) hold due to Theorem 2.1 and Lemma 4.5. The third one in (4.7) is proved in (C.2). By Lemma 3.6, the second inequality in (4.8) holds. We also find
which asserts the first inequality in (4.8). Finally, we obtain
due to (4.7). All of the above estimates enable us to prove the claim (4.6). This together with Lemma 4.2 yields
for 0<h<h0:=γ/(4C). The proof is finished.
5.
Existence and convergence results for the semi-discrete case
We first prove the existence of the local solution to the approximate problem (2.5).
Proof of Theorem 2.3. Choose ε>0 as in Theorem 4.3. We consider the following set
and recall from Section 2 the space VR={u∈U:V(u)≤R}. We will find a minimizer ˉv of jh in the space Wε,R:=¯B2ε(ˉu)∩VR, and then show that ‖ˉv−ˉu‖L2(I)<ε. It will imply that ˉv is a local solution to (2.5).
Since jh is bounded from below on Wε,R, there exists a sequence {vk}k∈N⊂Wε,R such that
Moreover, since Wε,R is compactly embedded in Lp(I) for any p∈[1,∞), up to a subsequence, there exists a function ˉv∈Wε,R such that {vk} converges to ˉv in L2(I) and converges a.e. to ˉv. By definition, the function zk:=Gh(vk)∈Xrh satisfies
for all φ∈Xrh. Note that {zk}k∈N is a bounded set in the finite-dimensional space Xrh by Theorem 2.4 (see also Lemma B.4). Therefore, we can find a subsequence such that zk converges uniformly to a function ˉz∈Xrh. We claim that ˉz=Gh(ˉv). Indeed, since vk(t) converges a.e. to ˉv(t) for t∈I and f is Lipschitz continuous, we may pass to the limit k→∞ in (5.2) to deduce
for all φ∈Xrh. This yields that ˉz=Gh(ˉv), which enables us to derive
This together with (5.1) implies that ˉv∈Wε,R satisfies
It remains to show that the minimizer ˉv∈Wε,R is achieved in the interior of Bε(ˉu)={u∈Uad:‖u−ˉu‖L2(I)<ε}. To show this, we recall that
and
Since ‖G(u)‖W1,∞(I)≤C for all u∈Uad, we see from Theorem 2.1 that
where C>0 is independent of h. Combining this with the Lipschitz continuity of G yields that
Set h0:=γε2/(8C). Using this and the estimate
from Theorem 4.3, it follows that for h∈(0,h0) we have
Thus, the minimizer ˉv is achieved in Bε(ˉu), which gives jh(u)≥jh(ˉv) for all u∈VR with ‖u−ˉv‖L2≤ε. We now provide the details of the convergence estimate for the approximate solutions.
Proof of Theorem 2.4. Analogous to (4.3), the discrete first order necessary optimality condition for ˉuh∈Uad reads
Inserting here u=ˉu and summing it with (4.3), we get
Now, by applying the mean value theorem with a value t∈(0,1), we have
where we used Lemma 4.6 in the first inequality and (5.4) in the second inequality. For our aim, it only remains to estimate the right hand side. Let us express it using the adjoint states. From (3.5), we have
and it follows from (3.9) that
Here we recall that ˉxh∈Xrh denotes the solution to (2.3) with control ˉu and initial data x0. Combining (5.6) and (5.7), we find
Applying Hölder's inequality here and using (1.3), we deduce
Now we apply (3.10) and (3.13) to get
Combining this with (5.5), we finally obtain
This completes the proof.
6.
Existence and convergence results for the fully discrete case
This section is devoted to the existence and convergence results for the fully discrete case. We consider a finite dimensional space Uh which discretizes the control space Uad, for example, the space of step functions
or the high-order DG space Uh=Xrh∩Uad with r∈N.
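For the step-function choice of Uh, a natural projection Ph (of the kind required later in Theorem 6.2) is the interval-wise average, i.e., the L² projection onto piecewise constants, which is first-order accurate for smooth or Lipschitz controls. A sketch (the function names and the quadrature rule are ours):

```python
import math

def project_piecewise_constant(f, T, N, q=64):
    """L^2 projection of f onto piecewise constants on N uniform intervals:
    the average of f on each interval, approximated by a composite midpoint rule."""
    h = T / N
    averages = []
    for n in range(N):
        a = n * h
        averages.append(sum(f(a + (j + 0.5) * h / q) for j in range(q)) / q)
    return averages

def l2_error(f, averages, T, q=64):
    """Composite-midpoint approximation of ||f - P_h f||_{L^2(0,T)}."""
    N = len(averages)
    h = T / N
    err2 = 0.0
    for n in range(N):
        a = n * h
        for j in range(q):
            t = a + (j + 0.5) * h / q
            err2 += (f(t) - averages[n]) ** 2 * (h / q)
    return math.sqrt(err2)

def f(t):
    return math.sin(2.0 * math.pi * t)

# First-order decay of the projection error as the mesh is refined.
errs = [l2_error(f, project_piecewise_constant(f, 1.0, N), 1.0) for N in (8, 16, 32)]
```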
We say that ˉuh∈Uh is a local solution to
if there is a value ε>0 such that jh(u)≥jh(ˉuh) for all u∈Uh with ‖u−ˉuh‖L2≤ε.
The existence result of local solution is provided in the following theorem.
Theorem 6.1. Let ˉu∈Uad be a local solution of (1.4) satisfying Assumption 1, and choose ε>0 as in Theorem 4.3. Then there exists h0>0 such that for h∈(0,h0) problem (6.1) has a local solution ˉuh∈Uh such that ‖ˉu−ˉuh‖L2≤ε.
Proof. By compactness and continuity, jh has a minimizer ˉuh in
since Uh is finite dimensional. Next we aim to show that the minimizer ˉuh satisfies
To show this, we recall from (5.3) that there is a value h0>0 such that for h∈(0,h0) we have
Combining this with the minimality of ˉuh for jh in ¯B2ε(ˉu), we find that ‖ˉuh−ˉu‖L2(I)≤ε. It then yields that
Thus ˉuh is a local solution of (6.1).
We establish the convergence result in the following theorem.
Theorem 6.2. Assume the same hypotheses on ˉu∈Uad and λ(ˉu) as in Theorem 2.4. In addition, suppose that there exist a projection operator Ph:U→Uh and a value a>0 such that
Let ˉuh∈Uh be a local solution to (6.1) constructed in Theorem 6.1. Then the following estimate holds:
If we further assume that j′(ˉu)=0, then the above estimate can be improved to
Proof. In this case, by the first optimality conditions on ˉu and ˉuh, we have
The latter condition can be written as
where Rh:=jh′(ˉuh)(Phˉu−ˉu). Summing up the above two inequalities provides
i.e.,
By the assumption of the theorem,
On the other hand, by applying the mean value theorem and Lemma 4.6, we obtain
Combining this with (6.2) yields
Applying here the estimate (5.9) in the previous proof, we have
which together with (6.3) gives the desired estimate
When we further assume j′(ˉu)=0, it follows that
Using this and the estimates in (5.8), we find
Inserting this into (6.4) yields
It gives the desired estimate
The proof is done.
7.
Numerical experiments
In this section, we present several numerical experiments that validate our theoretical results. The forward-backward DG method [18] is employed to solve the OCP examples.
7.1. Linear problem
Let us consider the following simple one-dimensional OCP, used as an example in [36], which consists of maximizing the functional
subject to the state equation
and U=L2([0,1]). Using an idea similar to that in Section 3, based on the maximum principle, we can derive the adjoint equation for the above optimal control problem:
Furthermore, we also find that the optimal solutions ˉu=−λ and ˉx satisfy (7.2). Thus we have the solution
and
For fixed r∈N, we use Xrh as the approximation space for U. In Table 1, we report the discrete L2 errors between the optimal solutions and their approximations for the above optimal control problem. Here r+1 is the number of grid points in each time interval In, and we used equidistant points in our numerical computations. The numerical results confirm that the error is of order hr+1, as proved in Theorem 2.4.
7.2. Nonlinear problem
In this part, we consider the following nonlinear optimal control problem:
subject to the state equation
In this case, the corresponding adjoint equation and optimal control are given as follows.
and thus the optimal solution ˉx solves
In this case, since the actual solutions have no explicit form, we take the reference solutions ˉxh (resp., ˉuh) with h=(0.1)×2−9 in place of ˉx (resp., ˉu). In Table 2, we report the discrete L2 errors between the reference solutions and their approximations.
Next we consider a two dimensional problem given by
subject to the state equation
In this case, the corresponding adjoint equation and optimal control are given as follows.
This case also has no explicit solution, so we again take the reference solutions ˉxh (resp., ˉuh) with h=(0.1)×2−9 in place of ˉx (resp., ˉu). The discrete L2 errors between the reference solutions and their approximations are reported in Table 3.
8.
Conclusions
In this paper, we established the analysis of the DG discretization applied to the nonlinear OCP with an arbitrary degree r of piecewise polynomials for nonlinear functions f and g with suitable smoothness assumptions. Under the non-degeneracy condition on an optimal control of the OCP, we obtained the existence of a local solution to the approximate problem and sharp L2-error estimates for the approximate solutions. These results were extended to the fully discrete case, in which the control space is also discretized. Finally, we presented numerical experiments validating our theoretical results. Based on the results of this paper, it would be interesting to analyze the mesh refinement method for the discontinuous Galerkin discretization of optimal control problems. We would like to investigate this problem in the future.
Acknowledgments
The authors are grateful to the referees for valuable comments on the paper. The work of W. Choi is supported by NRF grants (No. 2016R1A5A1008055) and (No. 2021R1F1A1059671).
Conflict of interest
The authors declare no conflict of interest.
Appendix
A. Differentiability of the control-to-state mapping
In this appendix, we show that the control-to-state mapping G is twice differentiable and obtain its derivatives.
Lemma A.1. Let xs=G(u+sv) and y:[0,T]→Rd be the solution of
Then we have
Proof. Recall that xs and x satisfy
respectively. Using this, we find that r(t):=xs(t)−x(t)−sy(t) satisfies
where
and
Given |xs(t)−x(t)|≤Cs and (1.3), elementary calculus shows that |A1|≤Cs2 and |A2|≤Cs2. With these bounds, we may apply Grönwall's lemma to (C.3) to deduce |r(t)|≤Cs2 for t∈[0,T]. From this we find
which yields that
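Lemma A.1 can be verified numerically. In the sketch below (our own toy dynamics x′=−x³+u, a stand-in for f(t,x,u); explicit Euler on a shared grid), the linearized equation y′=−3x²y+v, y(0)=0 is solved alongside the states for u and u+sv, and the difference quotient (G(u+sv)−G(u))/s is compared with y:

```python
import math

def solve_state(u, x0, h):
    """Explicit Euler for x' = -x^3 + u (a stand-in for f(t, x, u))."""
    x = [x0]
    for ui in u:
        x.append(x[-1] + h * (-x[-1] ** 3 + ui))
    return x

def solve_sensitivity(x, v, h):
    """Euler for the linearized equation y' = -3 x^2 y + v, y(0) = 0."""
    y = [0.0]
    for i, vi in enumerate(v):
        y.append(y[-1] + h * (-3.0 * x[i] ** 2 * y[-1] + vi))
    return y

N, T, x0, s = 1000, 1.0, 1.0, 1e-5
h = T / N
u = [math.sin(3.0 * i * h) for i in range(N)]
v = [math.cos(2.0 * i * h) for i in range(N)]

x = solve_state(u, x0, h)
xs = solve_state([ui + s * vi for ui, vi in zip(u, v)], x0, h)
y = solve_sensitivity(x, v, h)
dev = max(abs((a - b) / s - yi) for a, b, yi in zip(xs, x, y))
```

The deviation `dev` is of order s, mirroring the |r(t)|≤Cs² bound in the proof.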
Next we show the twice differentiability of the mapping s↦G(u+sv) at s=0.
Lemma A.2. Let z:[0,T]→Rd be the solution of
Then we have
Proof. Let
Then we get
where
and
By Lemma 4.1, we have |ys(t)−y(t)|≤Cs. Given this estimate and that
from Lemma A.1, elementary calculus shows that |A1(t)|≤Cs2 and |A2(t)|≤Cs2. Inserting this estimate into (C.5) and applying Grönwall's lemma, we find
It proves that
This implies that
since
This completes the proof.
B. Grönwall-type inequality for the DG discretization of ODEs
In this section, we provide a Grönwall-type inequality for the DG discretization of ODEs with inputs. It will be used in Appendix C to establish the differentiability of the discrete control-to-state mapping Gh.
We begin by recalling the following lemma from [34, Lemma 2.4].
Lemma B.1. Let I=(a,b) and k=b−a>0. Then we have
for all ϕ(t)=(ϕ1(t),⋯,ϕd(t))∈Pr((a,b);Rd), r∈N0, where
The next result is from [34,Lemma 3.1].
Lemma B.2. For I=(a,b) and r∈N0, we have
for all ϕ(t)=(ϕ1(t),⋯,ϕd(t))∈Pr((a,b);Rd). Here C>0 is independent of r, a, b, and d.
We shall use the following Grönwall inequality.
Lemma B.3. Let {an}Nn=1 and {bn}Nn=1 be sequences of non-negative numbers satisfying b1≤b2≤⋯≤bN and b1=0. Assume that for a value h∈(0,1/2) we have
for n∈N. Then there exists a constant Q>0 independent of h∈(0,1/2) and N∈N such that
for any n∈N with n≤N/h.
Proof. The proof can be obtained by induction.
Now we obtain the Grönwall-type inequality.
Lemma B.4. Suppose that
for all φ∈Xrh. Then there exists a constant C>0 independent of h>0 such that
for all u1,u2∈Uad and h>0 small enough.
Proof. From the condition (B.1) we have
for all φ∈Xrh. To obtain the desired estimates, for each n∈{1,⋯,N} we shall take the following test functions φ∈Xrh supported on In given as
where 1In:I→{0,1} denotes the indicator function, that is, 1In(t)=1 for t∈In and 1In(t)=0 for t∈I∖In. First we take φ(t)=x(t)1In(t) for n=1,2,⋯,N. Then,
where for n=1 we abuse notation and write [x]0 for x+0. Notice that
where for n=1 the above is understood as ([x]0,x+0)=|x+0|2. Using this in (B.2), we find
By applying Cauchy-Schwarz inequality, we obtain
Secondly, we take φ(t)=(t−tn−1)x′(t)1In(t) to have
By using Hölder's inequality, we get
Notice that
Thus, choosing φ(t)=(t−tn−1)1In(t) gives
and subsequently, this yields
where hn=tn−tn−1. This together with Lemma A.1 asserts
for h>0 small enough. Combining (B.3) and (B.4), we find
where we applied Lemma B.1 in the second inequality. This, together with (B.5), gives
for h>0 small enough, where for n=1 one has |x−0|=0. This inequality trivially gives
for n=1,⋯,N. Now, by applying Lemma B.3 to find an estimate of |x−1n|2 and inserting it into (B.6), we achieve
Finally, by applying Lemma B.2 to the above, we obtain the desired estimate.
As a corollary, we have the following Lipschitz estimates.
Lemma B.5. For u,v∈Uad we have
and
Proof. Let us denote by x=Gh(u) and ˆx=Gh(v). Then it follows from (2.3) that
By (1.3), there exists a constant C>0 such that
By applying Lemma B.4, we get the inequality
This gives the first inequality. For the second one, we denote by λ=λh(u) and ˆλ=λh(v). Then, we see from the discrete adjoint Eq (3.8) that
By applying Lemma B.4 again in a backward way (see Lemma 3.7), we obtain
where we used
due to Lemma B.4. This completes the proof.
C. Differentiability of discrete control-to-state mapping
This section is devoted to prove that the discrete control-to-state mapping Gh is twice differentiable. We also obtain the first and second derivatives of Gh.
Theorem C.1. We denote xsh=Gh(u+sv) and let yh∈Xrh be the solution of the following discretized equation:
where xh=Gh(u). Then we have ddsxsh(t)|s=0=yh(t).
Proof. By Theorem 2.1 there exists a solution yh∈Xrh to
By Lemma B.4 we get
Recall that xsh and xh satisfy
Using this, we find that r(t):=xsh(t)−xh(t)−syh(t) satisfies
for all φ∈Xrh, where
and
Given |xsh(t)−xh(t)|≤Cs and (1.3), elementary calculus shows that |A1|≤Cs2 and |A2|≤Cs2. With these bounds, we may apply Lemma B.4 to deduce |r(t)|≤Cs2 for t∈[0,T]. From this we find that
which yields that
This completes the proof.
Lemma C.2. The following holds.
Proof. Let yh=Gh′(u1)v∈Xrh and zh=Gh′(u2)v∈Xrh. Then we obtain
and
for all φ∈Xrh. Combining these equalities, we have
for all φ∈Xrh. On the other hand, the following two inequalities hold:
and
Given these estimates, by applying Lemma B.4 to (C.4), we obtain
where we used Lemma B.5 in the second inequality.
Lemma C.3. Let zh∈Xrh be the solution of the following discretized equation:
for any φ∈Xrh, where yh∈Xrh is the solution of (C.1). Then we have
Proof. Let
It then follows that
where
and
We obtain from Lemma C.2 the estimate |ysh(t)−yh(t)|≤Cs. Given this estimate and that ddsxsh(t)|s=0=yh(t) from Theorem C.1, elementary calculus shows that |A1(t)|≤Cs2 and |A2(t)|≤Cs2. Inserting this estimate into (C.5) and using Lemma B.4, we find
This yields that
and so we have
since
The proof is done.
D. Derivations of the first order derivative of cost functionals
In this part, we give the proofs of Lemmas 3.3 and 3.5. Before presenting them, we explain how to derive the discrete adjoint Eq (3.8) from the Lagrangian associated with (2.5).
Let us first write the Lagrangian of the problems (1.1) and (3.7) as follows:
for λh∈Xrh, where the bilinear operator B(⋅,⋅) is given by (3.7). If we compute the functional derivative of the Lagrangian (D.1) with respect to the adjoint state λh, then δLh/δλh=0 leads to (3.7). We now derive the equation of the discrete adjoint state. Using integration by parts, we find
This enables us to rewrite the Lagrangian (D.1) as
and this further implies
for all ψh∈Xrh, where we applied integration by parts to (ψh,λh′)In to derive the second equality. The above equality corresponds to the adjoint Eq (3.8).
Proof of Lemma 3.3. In order to compute the functional derivative of j with respect to u, we consider j(u+sv)=J(u+sv,G(u+sv)) with v∈U and s∈R+. If we set xs(t):=G(u(t)+sv(t)) it follows from Lemma A.1 that y=ddsxs(t)|s=0 satisfies
with the initial condition y(0)=0. Recall from (3.4) that the adjoint state λ(t)=λ(u)(t) satisfies
Since xs(t) is differentiable with respect to s, the cost j(u+sv) is differentiable with respect to s and it is computed as
where we used
due to (D.3), (D.4), y(0)=0, and λ(T)=0.
Proof of Lemma 3.5. The proof is very similar to that of Lemma 3.3. We consider jh(u+sv)=J(u+sv,Gh(u+sv)) with v∈U and s∈R+. We recall from Theorem C.1 that the function xsh:=Gh(u+sv) is differentiable at s=0 with
where yh∈Xrh satisfies the following equation:
Using this, we obtain
We then take ψh=yh in (D.2) to get
On the other hand, by using the integration by parts, we find
where B(⋅,⋅) is the bilinear form appearing in (3.6). This yields
due to (D.5). This together with (D.6) concludes
where v∈U.
E. Derivations of the second order derivative of cost functionals
In this appendix, we provide details of the derivation of the second order derivative of cost functional j and its discrete version jh.
Lemma E.1. Let j be the cost functional for the optimal control problems (1.1) and (1.2). Then, for u∈Uad and v∈U, we have
Proof. As in Appendix D, we consider j(u+sv)=J(u+sv,G(u+sv)) with v∈U and s∈R+ and set xs(t):=G(u(t)+sv(t)). By Lemmas A.1 and A.2, it follows that
where y∈X is given as in (D.3) and z∈X is the solution to
with the initial condition z(0)=0. Then we obtain
On the other hand, we use (D.4) to get
where we used \lambda(T) = 0 and z(0) = 0 . By combining the above with (E.1), we have
This completes the proof.
Remark E.2. Solving the differential Eq (A.1) gives
and thus
where C > 0 depends only on \|f\|_{L^\infty(0, T; W^{1, \infty})} and T > 0 . This estimate for y enables to bound the first four integrals on the right hand side of (E.2) by
where C > 0 depends only on \|\lambda\|_{L^\infty(I)} , T , \|f\|_{L^\infty(0, T; W^{2, \infty})} , and \|g\|_{L^\infty(0, T; W^{2, \infty})} . This implies that if g is given by
then we have
which satisfies (2.4) if \gamma > C/2 . It would be interesting to develop a numerical method to check (2.4) in the general case.
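One simple numerical check of (2.4), in the spirit suggested above, is a centered second difference of the cost in random directions. For the toy linear-quadratic problem below (entirely our own example: x′=−x+u, x(0)=1, cost ∫½(x²+u²)dt, explicit Euler), j is exactly quadratic in u, so the second difference recovers j″(u)(v,v) and the coercivity j″(u)(v,v)≥‖v‖²L² can be confirmed directly:

```python
import random

def cost(u, x0, h):
    """j_h(u) for the toy problem: x' = -x + u (explicit Euler),
    with running cost (x^2 + u^2)/2."""
    x, val = x0, 0.0
    for ui in u:
        val += 0.5 * h * (x ** 2 + ui ** 2)
        x = x + h * (-x + ui)
    return val

N, T, x0 = 400, 1.0, 1.0
h = T / N
random.seed(1)
u = [random.uniform(-1.0, 1.0) for _ in range(N)]
v = [random.uniform(-1.0, 1.0) for _ in range(N)]

s = 1e-3
up = [ui + s * vi for ui, vi in zip(u, v)]
um = [ui - s * vi for ui, vi in zip(u, v)]
curvature = (cost(up, x0, h) - 2.0 * cost(u, x0, h) + cost(um, x0, h)) / s ** 2
norm_v_sq = sum(h * vi ** 2 for vi in v)
```

For nonquadratic j, the same second difference gives an O(s²)-accurate estimate of j″(u)(v,v).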
Next we perform a similar calculation for the approximate solution.
Lemma E.3. Let j_h be the discrete cost functional for the optimal control problems (1.1) and (1.2). Then, for u \in \mathcal{U}_{ad} and v \in \mathcal{U} , we have
Proof. As in the proof of Lemma 3.5, we consider j_h(u+sv) = J(u+sv, G_h(u + sv)) with v \in \mathcal{U} and s \in \mathbb{R}_+ and set x_{h}^s := G_h(u + sv) . We recall from Theorem C.1 and Lemma C.3 that
where z_h \in X^r_h satisfies
Now a straightforward computation gives
Note that the discrete adjoint state \lambda_h(t) = \lambda_h(u)(t) satisfies
for all \psi \in X^r_h . Thus by considering \psi = z_h \in X^r_h , we find
Combining the above equalities, we have
This completes the proof.