Research article

A sharp error analysis for the DG method of optimal control problems

  • Received: 07 September 2021 Revised: 28 February 2022 Accepted: 02 March 2022 Published: 09 March 2022
  • MSC : 49J15, 49M25, 65L05, 65L60

  • In this paper, we are concerned with a nonlinear optimal control problem of ordinary differential equations. We consider a discretization of the problem with the discontinuous Galerkin method with arbitrary order $r \in \mathbb{N} \cup \{0\}$. Under suitable regularity assumptions on the cost functional and solutions of the state equations, we first show the existence of a local solution to the discretized problem. We then provide sharp estimates for the $L^2$-error of the approximate solutions. The convergence rate of the error depends on the regularity of the optimal solution $\bar{u}$ and its adjoint state, together with the degree of the piecewise polynomials. Numerical experiments are presented supporting the theoretical results.

    Citation: Woocheol Choi, Young-Pil Choi. A sharp error analysis for the DG method of optimal control problems[J]. AIMS Mathematics, 2022, 7(5): 9117-9155. doi: 10.3934/math.2022506




    In the present work, we discuss discontinuous Galerkin (DG) approximations to a nonlinear optimal control problem (OCP) of ordinary differential equations (ODEs). More precisely, we consider the following optimal control problem:

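    $$ \text{Minimize } J(u,x) := \int_0^T g(t,x(t),u(t))\,dt, \tag{1.1} $$

    subject to

    $$ \begin{cases} x'(t) = f(t,x(t),u(t)), & \text{a.e. on } [0,T], \\ x(0) = x_0, \\ u \in U_{ad}, & \text{a.e. on } [0,T]. \end{cases} \tag{1.2} $$

    Here $u(t) \in \mathbb{R}^m$ is the control, and $x(t) \in \mathbb{R}^d$ is the state of the system at time $t \in [0,T]$. Further, $g : [0,T] \times \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}$ and $f : [0,T] \times \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}^d$ are given, and the set of admissible controls $U_{ad} \subset U := L^\infty(0,T;\mathbb{R}^m)$ is given by

    $$ U_{ad} := \{ u(t) \in \mathbb{R}^m : u_\ell \le u(t) \le u_u \} $$

    for some $u_\ell, u_u \in \mathbb{R}^m$. Here the inequality is understood in the component-wise sense.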

    There have been many studies on the numerical computation for the above problem. Numerical schemes require a discretization of the ODEs. For example, the Euler discretization for the OCPs of ODEs is well studied for sufficiently smooth optimal controls based on strong second-order optimality conditions [2,13,14]. For optimal control problems with control appearing linearly, the optimal control may be discontinuous, for instance a bang-bang control, and such conditions may not be satisfied. In that respect, there have been many studies developing new second-order optimality conditions for optimal control problems with control appearing linearly [3,21,31,32]. Second-order Runge-Kutta approximations for the OCPs were studied in [15]. Recently, the works [16,17] developed a novel stability technique to obtain new error estimates for the Euler discretization of OCPs.

    The pseudo-spectral method is also widely used for the discretization due to its high-order accuracy for smooth solutions to the OCPs [20,33]. However, the high-order accuracy of the pseudo-spectral method is known to be often lost for bang-bang OCPs, where the solutions may not be smooth enough. To handle this issue, Henriques et al. [24] proposed a mesh refinement method based on a high-order DG method for the OCPs of ODEs. The DG method divides the time interval into small subintervals, on which a weak formulation is employed. The test functions are usually taken as piecewise polynomials which can be discontinuous at the boundaries of the subintervals, see Section 2 for a more detailed discussion. We refer to [7,19,34] and references therein for DG methods for ODEs. We also refer to papers on the analysis of the discretization of optimal control problems of PDEs, for example, the elliptic problems [1,23,35] and the parabolic problems [9,12,25,26,27,28,29]. In addition, the recent works [22,30] studied the discretization of the optimal control for fractional diffusion problems.

    In this paper, we provide a rigorous analysis for the DG discretization applied to the nonlinear OCP (1.1) and (1.2) with arbitrary order $r \in \mathbb{N} \cup \{0\}$ for general functions $f$ and $g$ with suitable smoothness. Motivated by a recent work of Neitzel and Vexler [29], we impose the non-degeneracy condition (2.4) on an optimal control $\bar{u}$ of the OCP (1.1) and (1.2). We obtain the existence and convergence results for the semi-discretized case and the fully discretized case. The rates of convergence depend on the regularity of the optimal solution $\bar{u}$ and its adjoint state, together with the degree of the piecewise polynomials mentioned above, see Section 2 for details.

    It is worth noticing that the control is not required to enter the state equation (1.2) linearly, and the control space $U_{ad}$ allows discontinuous controls. The constraints for controls are defined by lower and upper bounds. Moreover, the cost functional is also given in a general form, not limited to be quadratic. We mention that the DG discretization of zeroth order was used in [29] for the optimal control problem for the semi-linear parabolic equation where the control is linearly applied to the system.

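    For notational simplicity, we denote $I := (0,T)$, $X := L^2(I;\mathbb{R}^d)$, and $(v,w)_I = (v,w)_{L^2(I;\mathbb{R}^d)}$. We also use the simplified notations

    $$ L^p(I) := L^p(I;\mathbb{R}^d) \quad \text{and} \quad W^{p,\infty}(I) := W^{p,\infty}(I;\mathbb{R}^d) $$

    for $1 \le p \le \infty$. Throughout this paper, for any compact set $K \subset \mathbb{R}^m$, we assume that $f, g \in C([0,T]; W^{3,\infty}(\mathbb{R}^d \times K))$ satisfy

    $$ \sup_{0 \le t \le T} \big( \|f(t,\cdot,\cdot)\|_{W^{3,\infty}} + \|g(t,\cdot,\cdot)\|_{W^{3,\infty}} \big) \le M \tag{1.3} $$

    for some $M > 0$.

    We next introduce the control-to-state mapping $G : U \to X \cap L^\infty(I;\mathbb{R}^d)$, $G(u) = x$, with $x$ solving (1.2). It induces the cost functional $j : U \to \mathbb{R}_+$, $u \mapsto J(u,G(u))$. This makes the optimal control problem (1.1) and (1.2) equivalent to

    $$ \text{Minimize } j(u) \text{ subject to } u \in U_{ad}. \tag{1.4} $$

    Definition 1.1. A control $\bar{u} \in U_{ad}$ is a local solution of (1.4) if there exists a constant $\varepsilon > 0$ such that $j(u) \ge j(\bar{u})$ holds for all $u \in U_{ad}$ with $\|\bar{u} - u\|_{L^2(I)} \le \varepsilon$.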

    In the proof of the existence and convergence results, the main task is to show that the strong convexity of $j$ induced by the second-order optimality condition (2.4) is preserved near the optimal control $\bar{u}$, and that it also holds for the DG discretized version $j_h$. This is achieved using the second-order analysis in Section 4. As a preliminary, we also justify that $j$ and $j_h$ are twice differentiable, by showing the differentiability of the control-to-state mapping $G$ and its discretized version $G_h$ in the appendix.

    In Section 2, we explain the DG discretization of the ODEs and the OCP. Then we present the main results for the semi-discretized case and provide some preliminary results. In Section 3, the adjoint problems are studied. Section 4 is devoted to the second-order analysis of the cost functionals $j$ and $j_h$. In Section 5, we prove the existence of the local solution and obtain the convergence rate for the semi-discretized case. Section 6 establishes the existence and convergence results for the fully discretized case. Finally, in Section 7, we perform several numerical experiments for linear and nonlinear OCPs. In Appendix A, we obtain the first and second order derivatives of the control-to-state mapping $G$. Appendix B proves a Grönwall-type inequality for the discretization of the ODEs (1.2) involving the control variable. It is used in Appendix C to establish the differentiability of the discrete control-to-state mapping $G_h$ and to obtain its derivatives. In Appendix D, we prove Lemmas 3.3 and 3.5, which reformulate the first derivatives of the cost functionals in terms of the adjoint states. In Appendix E, we derive the formulas for the second order derivatives of the cost functionals.

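    In this section, we describe the approximation of the OCP (1.1) and (1.2) with the DG method, and then we state the main results on the semi-discrete case. First, we illustrate the discretization of the ordinary differential equations

    $$ \begin{cases} x'(t) = F(t,x(t)), & t \in (0,T), \\ x(0) = x_0, \end{cases} \tag{2.1} $$

    where $x : [0,T] \to \mathbb{R}^d$ and $F : (0,T) \times \mathbb{R}^d \to \mathbb{R}^d$ is uniformly Lipschitz continuous with respect to $x$, i.e.,

    $$ \|F(t,u) - F(t,v)\| \le L \|u - v\|, \quad \forall\, u, v \in \mathbb{R}^d, \ t \in (0,T), $$

    with a constant $L > 0$. By the Cauchy-Lipschitz theorem, we have the existence and uniqueness of a classical solution $x$ of (2.1).

    Given an integer $N \in \mathbb{N}$, we consider a partition of $I$ into $N$ subintervals $\{I_n\}_{n=1}^N$ given by $I_n = (t_{n-1}, t_n)$ with nodes $0 =: t_0 < t_1 < \dots < t_{N-1} < t_N := T$. Let $h_n$ be the length of $I_n$, i.e., $h_n = t_n - t_{n-1}$, and set $h := \max_{1 \le n \le N} h_n$. For a piecewise continuous function $\varphi : [0,T] \to \mathbb{R}^d$, we also define

    $$ \varphi_n^+ := \lim_{t \to 0^+} \varphi(t_n + t), \quad 0 \le n \le N-1, \qquad \varphi_n^- := \lim_{t \to 0^+} \varphi(t_n - t), \quad 1 \le n \le N. $$

    The jump across the node $t_n$ is denoted by $[\varphi]_n := \varphi_n^+ - \varphi_n^-$ for $1 \le n \le N-1$. For $r \in \mathbb{N} \cup \{0\}$, we define

    $$ X_h^r := \{ \varphi_h \in X : \varphi_h|_{I_n} \in P^r(I_n), \ 1 \le n \le N \}, $$

    where $P^r(I_n)$ represents the set of all polynomials in $t$ of degree at most $r$ defined on $I_n$ with coefficients in $\mathbb{R}^d$. Then the DG approximate solution $x_h \in X_h^r$ of (2.1) is given by

    $$ \sum_{n=1}^N \big( x_h'(t) - F(t,x_h(t)), \varphi(t) \big)_{I_n} + \sum_{n=2}^N \big( [x_h]_{n-1}, \varphi_{n-1}^+ \big) + \big( x_{h,0}^+, \varphi_0^+ \big) = \big( x_0, \varphi_0^+ \big) \tag{2.2} $$

    for all $\varphi \in X_h^r$, where $x_{h,0}^+$ denotes the right limit of $x_h$ at $t_0$. Here $(\cdot,\cdot)$ denotes the inner product in $\mathbb{R}^d$, and

    $$ (A(t),B(t))_{I_n} = \int_{I_n} (A(t),B(t))\,dt $$

    for integrable functions $A, B : I_n \to \mathbb{R}^d$.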

    We recall the error estimate for the DG approximation of (2.1) from [34,Corollary 3.15 & Theorem 2.6].

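    Theorem 2.1. Let $x(t)$ be the solution of (2.1) such that $x \in W^{k,\infty}(I;\mathbb{R}^d)$ for some $k \ge 1$. Suppose that $hL < 1$. Then there exists a unique DG approximate solution $x_h \in X_h^r$ to (2.2) of order $r \in \mathbb{N} \cup \{0\}$. Furthermore, we have

    $$ \sup_{0 \le t \le T} |x_h(t) - x(t)| \le C h^{\min\{r+1,k\}} \|x\|_{W^{k,\infty}(I;\mathbb{R}^d)}, $$

    where $C > 0$ is determined by $L$, $T$, and $r$.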
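
    To make the lowest-order case concrete, the following is a minimal sketch, assuming the scalar linear test equation $x' = a x$ on a uniform mesh (a hypothetical example, not taken from the paper). For $r = 0$ the approximation is constant on each $I_n$, and testing (2.2) with the indicator of $I_n$ reduces the scheme to $X_n - X_{n-1} = h\,a\,X_n$ with $X_0 = x_0$. The helper names `dg0_linear` and `sup_error` are illustrative only.

```python
import math

def dg0_linear(a, x0, T, N):
    """Cell values X_1..X_N of the r = 0 DG scheme for x' = a*x (uniform mesh)."""
    h = T / N
    if abs(a) * h >= 1.0:
        # Theorem 2.1 assumes h*L < 1; here the Lipschitz constant is L = |a|.
        raise ValueError("need h*|a| < 1")
    values, prev = [], x0
    for _ in range(N):
        # One cell: X_n - X_{n-1} = h * a * X_n, solved for X_n.
        prev = prev / (1.0 - a * h)
        values.append(prev)
    return values

def sup_error(a, x0, T, N):
    """sup_t |x_h(t) - x(t)| for the exact solution x(t) = x0 * exp(a*t)."""
    h = T / N
    err = 0.0
    for n, Xn in enumerate(dg0_linear(a, x0, T, N)):
        # The exact solution is monotone, so on each cell the largest
        # deviation from the constant X_n occurs at one of the endpoints.
        left = x0 * math.exp(a * n * h)
        right = x0 * math.exp(a * (n + 1) * h)
        err = max(err, abs(Xn - left), abs(Xn - right))
    return err

if __name__ == "__main__":
    e1 = sup_error(-1.0, 1.0, 1.0, 50)
    e2 = sup_error(-1.0, 1.0, 1.0, 100)
    print("error ratio when h is halved:", e1 / e2)
```

    In this sketch, halving $h$ roughly halves the error, which is the behaviour one would expect from the exponent $\min\{r+1,k\} = 1$ for $r = 0$.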

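    Now, for a given $u \in U$, we consider the approximate solution $x \in X_h^r$ of the control problem (1.2) satisfying

    $$ \sum_{n=1}^N \big( x'(t) - f(t,x(t),u(t)), \varphi(t) \big)_{I_n} + \sum_{n=2}^N \big( [x]_{n-1}, \varphi_{n-1}^+ \big) + \big( x_0^+, \varphi_0^+ \big) = \big( x_0, \varphi_0^+ \big) \tag{2.3} $$

    for all $\varphi \in X_h^r$.

    Throughout the paper, we will consider local solutions $\bar{u}$ to (1.4) satisfying the following non-degeneracy condition.

    Assumption 1. Let $\bar{u} \in U_{ad}$ be a local solution of (1.1). We assume that it satisfies

    $$ j''(\bar{u})(v,v) \ge \gamma \|v\|_{L^2(I)}^2, \quad \forall\, v \in U \tag{2.4} $$

    for some $\gamma > 0$.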

    The differentiability of the cost functional $j(u) = J(u,G(u))$ with respect to $u \in U$ follows from the differentiability of the solution mapping $G(u)$ justified in Appendix A (see also the proofs of Lemmas 3.3 and E.1). Note that the above second-order optimality condition holds under suitable regularity assumptions on the functions $f$, $g$, and the solutions, see Remark E.2 for a detailed discussion. We refer to [4,5] for further discussion on the second-order condition and also [8,10,11] for the optimal control problem of PDEs.

    In addition, we assume that $\bar{u} \in U_{ad}$ has bounded total variation, i.e., $V(\bar{u}) \le R/2$ for a fixed value $R > 0$. Here the total variation $V(f)$ for $f \in L^\infty(0,T)$ is defined as

    $$ V(f) := \sup_{P} \sum_{j=0}^{n} |f(x_j) - f(x_{j+1})|, $$

    where $P$ is any partition $P = \{0 = x_0 < x_1 < x_2 < \dots < x_n < x_{n+1} = T\}$.
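
    As a small illustration of this definition, the sketch below evaluates the sum for one fixed partition, given the sampled values $f(x_0),\dots,f(x_{n+1})$. Since $V(f)$ is a supremum over all partitions, the value computed on a single grid is only a lower bound for $V(f)$. The function names `variation_on_grid` and `in_V_R` are hypothetical and are not part of the paper.

```python
def variation_on_grid(values):
    """Sum of |f(x_j) - f(x_{j+1})| over consecutive samples of f."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def in_V_R(values, R):
    """Grid-based check of V(u) <= R (necessary, not sufficient, for u in V_R)."""
    return variation_on_grid(values) <= R

if __name__ == "__main__":
    # Samples of a bang-bang control switching between +1 and -1 twice.
    bang_bang = [1.0, 1.0, -1.0, -1.0, 1.0]
    print(variation_on_grid(bang_bang))
    print(in_V_R(bang_bang, R=5.0))
```

    For a piecewise constant control, the sum equals the total size of the jumps as soon as the grid separates the switching points, which is why bang-bang controls with finitely many switches fit into the class $V_R$ for suitable $R$.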

    Considering a discrete control-to-state mapping $G_h : U \to X_h^r$, $u \mapsto G_h(u)$, where $G_h(u)$ is the solution of (2.3), we introduce the discrete cost functional $j_h : U \to \mathbb{R}_+$, $u \mapsto J(u,G_h(u))$. Let us consider the following discretized version of (1.1):

    $$ \min_{u \in U_{ad} \cap V_R} j_h(u), \tag{2.5} $$

    where

    $$ V_R = \{ u \in U : V(u) \le R \}. $$

    We now define the local solution to (2.5) as follows.

    Definition 2.2. A control $\bar{u}_h \in U_{ad} \cap V_R$ is called a local solution of (2.5) if there exists a $\delta > 0$ such that $j_h(u) \ge j_h(\bar{u}_h)$ holds for all $u \in U_{ad} \cap V_R$ with $\|u - \bar{u}_h\|_{L^2(I)} \le \delta$.

    In the first main result, we prove the existence of the local solution to the approximate problem (2.5).

    Theorem 2.3. Let $\bar{u} \in U_{ad} \cap V_{R/2}$ be a local solution of (1.1) satisfying Assumption 1. Then, there are constants $\varepsilon > 0$ and $h_0 > 0$ such that for $h \in (0,h_0)$ the approximate problem (2.5) has a local solution $\bar{u}_h \in U_{ad} \cap V_R$ satisfying $\|\bar{u}_h - \bar{u}\|_{L^2(I)} < \varepsilon$.

    The second main result is the following convergence estimate of the approximate solutions.

    Theorem 2.4. Let $\bar{u} \in U_{ad} \cap V_{R/2}$ be a local solution of (1.4) satisfying Assumption 1, let $\bar{u}_h$ be the approximate solution found in Theorem 2.3, and let $\lambda(\bar{u})$ be the adjoint state defined in Definition 3.1 below. Assume that the state $\bar{x} = G(\bar{u})$ belongs to $W^{k_1,\infty}(I;\mathbb{R}^d)$ and the adjoint state $\lambda(\bar{u})$ belongs to $W^{k_2,\infty}(I;\mathbb{R}^d)$ for some $k_1, k_2 \ge 1$. Then we have

    $$ \|\bar{u} - \bar{u}_h\|_{L^2(I)} = O\big(h^{\min\{r+1,k_1,k_2\}}\big). $$

    The required regularity of the solutions $\bar{x}$ and $\lambda(\bar{u})$ can be obtained under suitable smoothness assumptions on $f$, $g$, and $\bar{u}$, see Remark 3.2 below. The above result establishes the error estimate concerning the discretization of the ODEs in the OCPs. We will give the proofs of Theorems 2.3 and 2.4 in Section 5. On the other hand, to implement a numerical computation for the OCP (1.4), one also needs to approximate the control space by a finite dimensional space. In Section 6, we will see that the proof of Theorem 2.4 can be extended to the error analysis incorporating the discretization of the control space.
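
    When checking such a rate numerically, a common device is the observed order computed from errors on successively refined meshes. The following is a minimal sketch under the assumption that the error behaves like $C h^p$; the helper names `predicted_order` and `observed_orders` are hypothetical and the sample errors in the usage part are synthetic, not results from the paper.

```python
import math

def predicted_order(r, k1, k2):
    """Exponent min{r+1, k1, k2} appearing in Theorem 2.4."""
    return min(r + 1, k1, k2)

def observed_orders(hs, errors):
    """Observed orders log(e_i / e_{i+1}) / log(h_i / h_{i+1}) between meshes."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(hs[i] / hs[i + 1])
        for i in range(len(hs) - 1)
    ]

if __name__ == "__main__":
    hs = [0.1, 0.05, 0.025]
    # Synthetic errors behaving exactly like 3 * h^2, for illustration only.
    errors = [3.0 * h ** 2 for h in hs]
    print(observed_orders(hs, errors))
    print(predicted_order(r=1, k1=3, k2=2))
```

    Comparing the two outputs is one way to see whether the polynomial degree or the regularity of the state and adjoint state limits the rate in a given experiment.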

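    This section is devoted to the study of the adjoint states of the OCP (1.1) and its discretized version (2.5).

    We introduce a bilinear form $b(\cdot,\cdot)$ for $x \in W^{1,\infty}(0,T)$ and $\varphi \in X$ by

    $$ b(x,\varphi) := \int_0^T x'(t) \cdot \varphi(t)\,dt. \tag{3.1} $$

    Then, for a fixed control $u \in U$ and initial data $x_0 \in \mathbb{R}^d$, a weak formulation of (1.2) can be written as

    $$ b(x,\varphi) = \int_0^T f(t,x(t),u(t)) \cdot \varphi(t)\,dt \tag{3.2} $$

    for all $\varphi \in X$ with $x(0) = x_0$.

    Definition 3.1. For a control $u \in U$, we define the adjoint state $\lambda = \lambda(u) \in W^{1,\infty}(0,T)$ as the solution to

    $$ \lambda'(t) = -\partial_x f(t,x(t),u(t))\lambda(t) + \partial_x g(t,x(t),u(t)), \tag{3.3} $$

    with $\lambda(T) = 0$. It satisfies the weak formulation

    $$ b(\varphi,\lambda) = \big( \varphi, \partial_x f(\cdot,x,u)\lambda - \partial_x g(\cdot,x,u) \big)_{L^2(I)} \tag{3.4} $$

    for all $\varphi \in X$ with $\lambda(T) = 0$.

    Remark 3.2. It follows from equations (1.2) and (3.3) that if

    $$ f \in C_b^{\alpha}(\mathbb{R}_+ \times \mathbb{R}^d \times \mathbb{R}^d), \quad g \in C_b^{\beta}(\mathbb{R}_+ \times \mathbb{R}^d \times \mathbb{R}^d) \quad \text{and} \quad u \in C_b^{\gamma}([0,T]), $$

    we have

    $$ x \in C_b^{\min\{\alpha,\gamma\}+1}([0,T]) \quad \text{and} \quad \lambda \in C_b^{\min\{\alpha,\beta,\gamma+1\}}([0,T]). $$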

    For $u, v \in U$, the derivative of $j$ at $u$ in the direction $v$ is defined by

    $$ j'(u)v := \lim_{t \to 0^+} \frac{j(u+tv) - j(u)}{t}. $$

    It is well known that the derivative of the cost functional can be calculated with the adjoint state, as described below.

    Lemma 3.3. We have

    $$ j'(u)(v) = \big( \partial_u g(\cdot,x,u) - \partial_u f(\cdot,x,u)\lambda(u), v \big)_I \tag{3.5} $$

    for all $v \in U_{ad}$, where $x = G(u)$.

    Proof. For the completeness of the paper, we give the proof in Appendix D.
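
    The adjoint-based formula can be checked numerically on a toy problem. The sketch below is an illustration under several assumptions that are not from the paper: a hypothetical scalar model with $f(x,u) = -x^3 + \sin u$ and $g(x,u) = (x^2+u^2)/2$, and an explicit Euler discretization (chosen instead of DG only because its discrete adjoint is easy to write down). The backward recursion mirrors $\lambda' = -\partial_x f\,\lambda + \partial_x g$ with $\lambda(T) = 0$, and the gradient mirrors $\partial_u g - \partial_u f\,\lambda$.

```python
import math

# Hypothetical scalar model problem (not from the paper).
def f(x, u):  return -x ** 3 + math.sin(u)
def fx(x, u): return -3.0 * x ** 2
def fu(x, u): return math.cos(u)
def g(x, u):  return 0.5 * (x ** 2 + u ** 2)
def gx(x, u): return x
def gu(x, u): return u

def solve_state(u, x0, h):
    """Explicit Euler states x_0..x_N for piecewise constant controls u_0..u_{N-1}."""
    x = [x0]
    for uk in u:
        x.append(x[-1] + h * f(x[-1], uk))
    return x

def cost(u, x0, h):
    """Left-endpoint quadrature of the integral of g(x(t), u(t))."""
    x = solve_state(u, x0, h)
    return h * sum(g(xk, uk) for xk, uk in zip(x, u))

def adjoint_gradient(u, x0, h):
    """Partial derivatives of cost with respect to each u_k via one backward sweep."""
    x = solve_state(u, x0, h)
    n = len(u)
    lam = [0.0] * (n + 1)  # lam[n] = 0 mirrors lambda(T) = 0
    for k in range(n - 1, -1, -1):
        # Discrete counterpart of lambda' = -f_x * lambda + g_x, stepped backward.
        lam[k] = lam[k + 1] + h * (fx(x[k], u[k]) * lam[k + 1] - gx(x[k], u[k]))
    # Discrete counterpart of (g_u - f_u * lambda, v)_I with v a unit cell function.
    return [h * (gu(x[k], u[k]) - fu(x[k], u[k]) * lam[k + 1]) for k in range(n)]

if __name__ == "__main__":
    n, x0 = 50, 1.0
    h = 1.0 / n
    u = [0.3 * math.cos(2.0 * k * h) for k in range(n)]
    grad = adjoint_gradient(u, x0, h)
    eps, k = 1e-6, 17
    up, um = list(u), list(u)
    up[k] += eps
    um[k] -= eps
    print(grad[k], (cost(up, x0, h) - cost(um, x0, h)) / (2 * eps))
```

    The design point is the usual one for adjoint methods: a single backward sweep yields all components of the gradient, whereas finite differences need one extra state solve per control component.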

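    Next we describe the adjoint problem for the approximate problem (2.5). For $x, \varphi \in X_h^r$, we define

    $$ B(x,\varphi) := \sum_{n=1}^N (x',\varphi)_{I_n} + \sum_{n=2}^N \big( [x]_{n-1}, \varphi_{n-1}^+ \big) + \big( x_0^+, \varphi_0^+ \big). \tag{3.6} $$

    For the approximate solution $x_h = G_h(u) \in X_h^r$, equation (2.3) with control $u \in U$ can be written as

    $$ B(x_h,\varphi) = (f(\cdot,x_h,u),\varphi)_I + (x_0,\varphi_0^+), \quad \forall\, \varphi \in X_h^r. \tag{3.7} $$

    Now we define the adjoint equation for the approximate problem (2.5).

    Definition 3.4. The adjoint state $\lambda_h = \lambda_h(u) \in X_h^r$ is defined as the solution of the following discrete adjoint equation:

    $$ B(\varphi,\lambda_h) = \big( \varphi, \partial_x f(\cdot,x_h,u)\lambda_h - \partial_x g(\cdot,x_h,u) \big)_I, \quad \forall\, \varphi \in X_h^r. \tag{3.8} $$

    In Appendix D, we briefly explain how the adjoint equation (3.8) can be derived from the Lagrangian related to (2.5). We also have an analogous result to Lemma 3.3.

    Lemma 3.5. We have

    $$ j_h'(u)(v) = \big( \partial_u g(\cdot,x_h,u) - \partial_u f(\cdot,x_h,u)\lambda_h, v \big)_I, \quad \forall\, v \in U_{ad}, \tag{3.9} $$

    where $x_h = G_h(u)$.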

    Proof. The proof is given in Appendix D.

    In order to prove the main results in Section 2, we shall use the following lemma.

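    Lemma 3.6. Let $u \in U$. Suppose that $x = G(u) \in W^{k_1,\infty}(I;\mathbb{R}^d)$ and $\lambda = \lambda(u) \in W^{k_2,\infty}(I;\mathbb{R}^d)$ for some $k_1, k_2 \ge 1$. Then we have

    $$ \|\lambda(u) - \lambda_h(u)\|_{L^2(I)} = O\big(h^{\min\{k_1,k_2,r+1\}}\big). \tag{3.10} $$

    Proof. We recall from (3.4) and (3.8) that $\lambda = \lambda(u)$ solves

    $$ b(\varphi,\lambda) - (\varphi, \partial_x f(\cdot,x,u)\lambda)_{L^2(I)} = -(\varphi, \partial_x g(\cdot,x,u))_I, \tag{3.11} $$

    and $\lambda_h = \lambda_h(u)$ solves

    $$ B(\varphi,\lambda_h) - (\varphi, \partial_x f(\cdot,x,u)\lambda_h)_{L^2(I)} = -(\varphi, \partial_x g(\cdot,x_h,u))_{L^2(I)} + \big( \varphi, (\partial_x f(\cdot,x_h,u) - \partial_x f(\cdot,x,u))\lambda_h \big)_{L^2(I)}, \quad \forall\, \varphi \in X_h^r. \tag{3.12} $$

    Here $x = G(u) \in X$ and $x_h = G_h(u) \in X_h^r$. The estimate of $x - x_h$ follows from Theorem 2.1:

    $$ \|x - x_h\|_{L^\infty(I)} = O\big(h^{\min\{k_1,r+1\}}\big) \|x\|_{W^{k_1,\infty}(I)}. \tag{3.13} $$

    As an auxiliary function, we consider $\zeta_h \in X_h^r$ solving

    $$ B(\varphi,\zeta_h) - (\varphi, \partial_x f(\cdot,x,u)\zeta_h)_I = -(\varphi, \partial_x g(\cdot,x,u))_I, \quad \forall\, \varphi \in X_h^r, \tag{3.14} $$

    which is the DG discretization of (3.11) in a backward way (see Lemma 3.7 below). Then, by Theorem 2.1, we have

    $$ \|\zeta_h - \lambda\|_{L^\infty(I)} = O\big(h^{\min\{k_2,r+1\}}\big) \|\lambda\|_{W^{k_2,\infty}(I)}. \tag{3.15} $$

    By (3.13), we obtain

    $$ \partial_x g(\cdot,x,u) - \partial_x g(\cdot,x_h,u) = O\big(h^{\min\{k_1,r+1\}}\big) $$

    and

    $$ (\partial_x f(\cdot,x_h,u) - \partial_x f(\cdot,x,u))\lambda_h(u) = O\big(h^{\min\{k_1,r+1\}}\big). $$

    Subtracting (3.14) from (3.12) gives

    $$ B(\varphi,\lambda_h - \zeta_h) = \big( \varphi, \partial_x f(\cdot,x,u)(\lambda_h - \zeta_h) \big)_I + (\varphi,R(t))_I, \quad \forall\, \varphi \in X_h^r, $$

    where $R : I \to \mathbb{R}^d$ is given by

    $$ R(t) = (\partial_x f(\cdot,x_h,u) - \partial_x f(\cdot,x,u))\lambda_h(u) + \partial_x g(\cdot,x,u) - \partial_x g(\cdot,x_h,u), $$

    and it satisfies $R(t) = O(h^{\min\{k_1,r+1\}})$. This, together with Lemma B.4, yields

    $$ \|\lambda_h - \zeta_h\|_{L^\infty(I)} = O\big(h^{\min\{k_1,r+1\}}\big). $$

    Combining this estimate with (3.15), we get

    $$ \|\lambda_h - \lambda\|_{L^\infty(I)} = O\big(h^{\min\{k_1,k_2,r+1\}}\big), $$

    which completes the proof, since the $L^2(I)$-norm is controlled by the $L^\infty(I)$-norm on the bounded interval $I$.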

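    With a slight abuse of notation, let us define $J$ as the interval $I$ equipped with the partition $0 = s_0 < s_1 < \dots < s_{N-1} < s_N = T$, where $s_j = T - t_{N-j}$. We also denote by $X_{h,J}^r$ the DG space $X_h^r$ associated with this new partition. Then we have the following lemma.

    Lemma 3.7. Assume that $\lambda \in X_h^r$ is a solution to

    $$ B(\phi,\lambda) = (\phi,F(t,\lambda))_I, \quad \forall\, \phi \in X_h^r. $$

    Then $W : I \to \mathbb{R}^d$ defined by $W(t) = \lambda(T-t)$ for $t \in I = [0,T]$ satisfies

    $$ B(W,\psi) = (F(t,W),\psi)_I, \quad \forall\, \psi \in X_{h,J}^r. $$

    Proof. By an integration by parts,

    $$ B(\phi,\lambda) = \sum_{n=1}^N (\phi',\lambda)_{I_n} + \sum_{n=2}^N \big( [\phi]_{n-1}, \lambda_{n-1}^+ \big) + (\phi_0^+,\lambda_0^+) = -\sum_{n=1}^N (\phi,\lambda')_{I_n} - \sum_{n=1}^{N-1} \big( \phi_n^-, [\lambda]_n \big) + (\phi_N^-,\lambda_N^-), $$

    which leads to

    $$ -\sum_{n=1}^N (\phi,\lambda')_{I_n} - \sum_{n=1}^{N-1} \big( \phi_n^-, [\lambda]_n \big) + (\phi_N^-,\lambda_N^-) = (\phi,F(t,\lambda))_I, \quad \forall\, \phi \in X_h^r. \tag{3.16} $$

    We now observe that $W(t) = \lambda(T-t)$ satisfies $W'(t) = -\lambda'(T-t)$ and $[W]_{N-n} = -[\lambda]_n$. We also set $\psi(t) = \phi(T-t)$. Then $\psi \in X_{h,J}^r$ and we have $\phi_n^- = \psi_{N-n}^+$. Considering $J_n := (s_{n-1},s_n)$, the interval $J_n$ is the image of $I_{N+1-n}$ under the reflection $t \mapsto T-t$ for $1 \le n \le N$. Using these notations, we write (3.16) as

    $$ \sum_{n=1}^N (\psi,W')_{J_{N+1-n}} + \sum_{n=1}^{N-1} \big( \psi_{N-n}^+, [W]_{N-n} \big) + (\psi_0^+,W_0^+) = (\psi,F(t,W))_I, \quad \forall\, \psi \in X_{h,J}^r. $$

    Rearranging this, we get

    $$ \sum_{n=1}^N (W',\psi)_{J_n} + \sum_{n=1}^{N-1} \big( [W]_n, \psi_n^+ \big) + (W_0^+,\psi_0^+) = (F(t,W),\psi)_I, \quad \forall\, \psi \in X_{h,J}^r, $$

    which is the desired equation $B(W,\psi) = (F(t,W),\psi)_I$. The proof is finished.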

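    In this section, we analyze the second order condition of the functionals $j$ and $j_h$, which is essential for the existence and convergence estimates in the next sections.

    We defined the solution mapping $G : U \to X \cap L^\infty(I;\mathbb{R}^d)$ in the previous section. Here we present Lipschitz estimates for the solution mapping $G$, its derivative $G'$, and the solution to the adjoint equation (3.4).

    Lemma 4.1. There exists $C > 0$ such that for all $u, \hat{u} \in U_{ad}$ and $v \in U$ we have

    $$ \|G(u) - G(\hat{u})\|_{L^\infty(I)} \le C \|u - \hat{u}\|_{L^2(I)}, \qquad \|G'(u)v - G'(\hat{u})v\|_{L^\infty(I)} \le C \|u - \hat{u}\|_{L^2(I)} \|v\|_{L^2(I)}, $$

    and

    $$ \|\lambda(u) - \lambda(\hat{u})\|_{L^\infty(I)} \le C \|u - \hat{u}\|_{L^2(I)}. $$

    Proof. Let us denote $x = G(u)$ and $\hat{x} = G(\hat{u})$. Then it follows from (3.2) that

    $$ (x - \hat{x})'(t) = f(t,x(t),u(t)) - f(t,\hat{x}(t),\hat{u}(t)). \tag{4.1} $$

    By (1.3), there exists a constant $C > 0$ such that

    $$ |f(t,x(t),u(t)) - f(t,\hat{x}(t),\hat{u}(t))| \le C |\hat{x}(t) - x(t)| + C |\hat{u}(t) - u(t)|. $$

    Using this estimate and applying the Grönwall inequality in (4.1), we get the inequality

    $$ \|x - \hat{x}\|_{L^\infty(I)} \le C \|u - \hat{u}\|_{L^1(I)} \le C \|u - \hat{u}\|_{L^2(I)}. $$

    This gives the first inequality. For the second one, if we set $y = G'(u)v$ and $\hat{y} = G'(\hat{u})v$, then it follows from Lemma A.1 that

    $$ (y - \hat{y})'(t) = \partial_x f(t,x(t),u(t))(y - \hat{y})(t) + \big( \partial_x f(t,x,u) - \partial_x f(t,\hat{x},\hat{u}) \big)\hat{y}(t) + \big( \partial_u f(t,x,u) - \partial_u f(t,\hat{x},\hat{u}) \big)v(t). $$

    This together with the first assertion above yields

    $$ \begin{aligned} \|y - \hat{y}\|_{L^\infty(I)} &\le C \big\| (\partial_x f(\cdot,x,u) - \partial_x f(\cdot,\hat{x},\hat{u}))\hat{y} \big\|_{L^1(I)} + C \big\| (\partial_u f(\cdot,x,u) - \partial_u f(\cdot,\hat{x},\hat{u}))v \big\|_{L^1(I)} \\ &\le C \big( \|x - \hat{x}\|_{L^2(I)} + \|u - \hat{u}\|_{L^2(I)} \big) \|v\|_{L^2(I)} \le C \|u - \hat{u}\|_{L^2(I)} \|v\|_{L^2(I)}. \end{aligned} $$

    For notational simplicity, we denote $\lambda = \lambda(u)$ and $\hat{\lambda} = \lambda(\hat{u})$. Then, by (3.3), we get

    $$ (\lambda - \hat{\lambda})'(t) = -\partial_x f(\cdot,x,u)(\lambda - \hat{\lambda})(t) - \big( \partial_x f(\cdot,x,u) - \partial_x f(\cdot,\hat{x},\hat{u}) \big)\hat{\lambda}(t) + \big( \partial_x g(\cdot,x,u) - \partial_x g(\cdot,\hat{x},\hat{u}) \big)(t), \quad t \in (0,T), $$

    with $(\lambda - \hat{\lambda})(T) = 0$. By applying the Grönwall inequality in a backward way, we obtain

    $$ \begin{aligned} \|\lambda - \hat{\lambda}\|_{L^\infty(I)} &\le C \big\| (\partial_x f(\cdot,x,u) - \partial_x f(\cdot,\hat{x},\hat{u}))\hat{\lambda} \big\|_{L^1(I)} + C \big\| \partial_x g(\cdot,x,u) - \partial_x g(\cdot,\hat{x},\hat{u}) \big\|_{L^1(I)} \\ &\le C \big( \|\hat{\lambda}\|_{L^\infty(I)} + 1 \big) \big( \|x - \hat{x}\|_{L^\infty(I)} + \|u - \hat{u}\|_{L^2(I)} \big) \le C \|u - \hat{u}\|_{L^2(I)}, \end{aligned} $$

    where we used

    $$ \|\hat{\lambda}\|_{L^\infty(I)} \le C \|\partial_x g\|_{L^\infty(I)} $$

    due to (3.3) and $\hat{\lambda}(T) = 0$. This completes the proof.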

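    We now show that the second order condition of $j$ holds near the local optimal solution $\bar{u} \in U_{ad}$.

    Lemma 4.2. Suppose that $\bar{u} \in U_{ad}$ satisfies Assumption 1. Then there exists $\varepsilon > 0$ such that

    $$ j''(u)(v,v) \ge \frac{\gamma}{2} \|v\|_{L^2(I)}^2 $$

    holds for all $v \in U$ and all $u \in U_{ad}$ with $\|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon$. Here $\gamma > 0$ is given in (2.4).

    Proof. Let $y(t) = G'(u)v$ and $\bar{y}(t) = G'(\bar{u})v$. By using Lemma E.1, we find

    $$ \begin{aligned} j''(u)(v,v) - j''(\bar{u})(v,v) = {} & -\int_0^T \lambda(t) \Big( \frac{\partial^2 f}{(\partial x)^2}(t,x,u)\,y^2(t) + 2\frac{\partial^2 f}{\partial x\,\partial u}(t,x,u)\,y(t)v(t) + \frac{\partial^2 f}{(\partial u)^2}(t,x,u)\,v^2(t) \Big)\,dt \\ & + \int_0^T \Big( \frac{\partial^2 g}{(\partial x)^2}(t,x,u)\,y^2(t) + 2\frac{\partial^2 g}{\partial x\,\partial u}(t,x,u)\,y(t)v(t) + \frac{\partial^2 g}{(\partial u)^2}(t,x,u)\,v^2(t) \Big)\,dt \\ & + \int_0^T \bar{\lambda}(t) \Big( \frac{\partial^2 f}{(\partial x)^2}(t,\bar{x},\bar{u})\,\bar{y}^2(t) + 2\frac{\partial^2 f}{\partial x\,\partial u}(t,\bar{x},\bar{u})\,\bar{y}(t)v(t) + \frac{\partial^2 f}{(\partial u)^2}(t,\bar{x},\bar{u})\,v^2(t) \Big)\,dt \\ & - \int_0^T \Big( \frac{\partial^2 g}{(\partial x)^2}(t,\bar{x},\bar{u})\,\bar{y}^2(t) + 2\frac{\partial^2 g}{\partial x\,\partial u}(t,\bar{x},\bar{u})\,\bar{y}(t)v(t) + \frac{\partial^2 g}{(\partial u)^2}(t,\bar{x},\bar{u})\,v^2(t) \Big)\,dt, \end{aligned} $$

    where we denoted $\lambda(t) := \lambda(u)(t)$, $x(t) := G(u)(t)$, $\bar{\lambda}(t) := \lambda(\bar{u})(t)$, and $\bar{x}(t) := G(\bar{u})(t)$. On the other hand, it follows from Lemma 4.1 that

    $$ \begin{aligned} & \|x - \bar{x}\|_{L^\infty(I)} \le C \|u - \bar{u}\|_{L^2(I)}, \quad \|y - \bar{y}\|_{L^\infty(I)} \le C \|u - \bar{u}\|_{L^2(I)} \|v\|_{L^2(I)}, \quad \|y\|_{L^\infty(I)} \le C \|v\|_{L^2(I)}, \\ & \|\lambda\|_{L^\infty(I)} + \|\bar{\lambda}\|_{L^\infty(I)} \le C \|\partial_x g\|_{L^\infty(I)}, \quad \text{and} \quad \|\lambda - \bar{\lambda}\|_{L^\infty(I)} \le C \|u - \bar{u}\|_{L^2(I)}. \end{aligned} \tag{4.2} $$

    This together with the following estimate

    $$ \int_0^T |y^2(t) - \bar{y}^2(t)|\,dt \le \int_0^T |y(t) + \bar{y}(t)|\,|y(t) - \bar{y}(t)|\,dt \le \|y - \bar{y}\|_{L^2(I)} \big( \|y\|_{L^2(I)} + \|\bar{y}\|_{L^2(I)} \big) \le C \|u - \bar{u}\|_{L^2(I)} \|v\|_{L^2(I)}^2 $$

    yields

    $$ \big| j''(u)(v,v) - j''(\bar{u})(v,v) \big| \le C \|u - \bar{u}\|_{L^2(I)} \|v\|_{L^2(I)}^2. $$

    Combining this with (2.4) we have

    $$ j''(u)(v,v) = j''(\bar{u})(v,v) + \big( j''(u)(v,v) - j''(\bar{u})(v,v) \big) \ge \gamma \|v\|_{L^2(I)}^2 - C \|u - \bar{u}\|_{L^2(I)} \|v\|_{L^2(I)}^2. $$

    By choosing $\varepsilon = \frac{\gamma}{4C} > 0$ here, so that $C \|u - \bar{u}\|_{L^2(I)} \le 2C\varepsilon = \gamma/2$, we obtain the desired result.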

    As a consequence of this lemma, we have the following result.

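    Theorem 4.3. Let $\bar{u} \in U_{ad}$ satisfy the first optimality condition and Assumption 1. Then, there exists a constant $\varepsilon > 0$ such that

    $$ j(u) \ge j(\bar{u}) + \frac{\gamma}{2} \|u - \bar{u}\|_{L^2(I)}^2 $$

    for any $u \in U_{ad}$ with $\|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon$.

    Proof. Choose $\varepsilon > 0$ as in Lemma 4.2. By Taylor's theorem, we get

    $$ j(u) = j(\bar{u}) + j'(\bar{u})(u - \bar{u}) + \frac{1}{2} j''(\bar{u}_s)(u - \bar{u}, u - \bar{u}), $$

    where $\bar{u}_s = \bar{u} + s(u - \bar{u})$ for some $s \in [0,1]$. On the other hand, the first optimality condition implies

    $$ j'(\bar{u})(u - \bar{u}) \ge 0, \quad \forall\, u \in U_{ad}. \tag{4.3} $$

    Moreover, we also find

    $$ \|\bar{u} - \bar{u}_s\|_{L^2(I)} \le s \|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon. $$

    Using these observations and Lemma 4.2, we conclude

    $$ j(u) \ge j(\bar{u}) + \frac{\gamma}{2} \|u - \bar{u}\|_{L^2(I)}^2. $$

    The proof is finished.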

    In this part, we investigate the second order condition for the discrete cost functional $j_h$. As in the previous subsection, we first provide the Lipschitz estimates for $G_h$ and the discrete adjoint state.

    Lemma 4.4. Let $u, \hat{u} \in U_{ad}$ and $v \in U$ be given. Then, there exists $C > 0$, independent of $h \in (0,1)$, such that

    $$ \|G_h(u) - G_h(\hat{u})\|_{L^\infty(I)} \le C \|u - \hat{u}\|_{L^2(I)}, \qquad \|G_h'(u)v - G_h'(\hat{u})v\|_{L^2(I)} \le C \|u - \hat{u}\|_{L^2(I)} \|v\|_{L^2(I)}, $$

    and

    $$ \|\lambda_h(u) - \lambda_h(\hat{u})\|_{L^\infty(I)} \le C \|u - \hat{u}\|_{L^2(I)}. $$

    Proof. The first and the third assertions are proved in Lemma B.5. The second estimate is proved in Lemma C.2.

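    Lemma 4.5. For $u \in U_{ad}$, let $x = G(u)$ be the solution of the state equation (1.2), and let $y = G'(u)v$ for $v \in U$. Let $x_h = G_h(u)$ be the solution of the discrete state equation (3.7), and let $y_h = G_h'(u)v$. Then we have

    $$ \|y_h - y\|_{L^\infty(I)} \le C h \|v\|_{L^2(I)}. $$

    Proof. Define $\tilde{y} : [0,T] \to \mathbb{R}^d$ as the solution to

    $$ \tilde{y}'(t) = \partial_x f(\cdot,x_h,u)\tilde{y}(t) + \partial_u f(\cdot,x_h,u)v(t), \quad \tilde{y}(0) = 0. \tag{4.4} $$

    Recall from Lemma A.1 that $y$ satisfies

    $$ y'(t) = \partial_x f(\cdot,x,u)y + \partial_u f(\cdot,x,u)v, \quad y(0) = 0. $$

    Combining these two equations, we get

    $$ (\tilde{y} - y)'(t) = \partial_x f(t,x_h,u)(\tilde{y} - y)(t) + \big( \partial_x f(t,x_h,u) - \partial_x f(t,x,u) \big)y(t) + \big( \partial_u f(t,x_h,u) - \partial_u f(t,x,u) \big)v(t). $$

    Using the Grönwall inequality here with (4.2) and (3.13), we find that

    $$ \|\tilde{y} - y\|_{L^\infty(I)} \le C \|x_h - x\|_{L^\infty(I)} \big( \|y\|_{L^2(I)} + \|v\|_{L^2(I)} \big) \le C \|x_h - x\|_{L^\infty(I)} \|v\|_{L^2(I)} \le C h \|v\|_{L^2(I)}. \tag{4.5} $$

    On the other hand, $y_h$ satisfies

    $$ B(y_h,\varphi) = \big( \partial_x f(\cdot,x_h,u)y_h + \partial_u f(\cdot,x_h,u)v, \varphi \big)_I, \quad \forall\, \varphi \in X_h^r, $$

    which is the DG discretization of (4.4) in a backward way in view of Lemma 3.7. Thus, we may use Theorem 2.1 to obtain the following error estimate:

    $$ \|\tilde{y} - y_h\|_{L^\infty(I)} \le C h \|v\|_{L^2(I)}. $$

    This, together with (4.5), gives us the estimate

    $$ \|y_h - y\|_{L^\infty(I)} \le \|\tilde{y} - y\|_{L^\infty(I)} + \|\tilde{y} - y_h\|_{L^\infty(I)} \le C h \|v\|_{L^2(I)}. $$

    The proof is finished.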

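    Lemma 4.6. For $\varepsilon > 0$ given in Lemma 4.2, there exists $h_0 > 0$ such that for $h \in (0,h_0)$ we have the following inequality

    $$ j_h''(u)(v,v) \ge \frac{\gamma}{4} \|v\|_{L^2(I)}^2, \quad \forall\, v \in U $$

    for any $u \in U_{ad}$ satisfying $\|u - \bar{u}\|_{L^2(I)} \le \varepsilon$.

    Proof. We first claim that

    $$ \big| j''(u)(v,v) - j_h''(u)(v,v) \big| \le C h \|v\|_{L^2(I)}^2 \tag{4.6} $$

    for $h > 0$ small enough, where $C > 0$ is independent of $h$. Let $x(t) = G(u)(t)$, $\lambda(t) = \lambda(u)(t)$, $x_h(t) = G_h(u)(t)$, and $\lambda_h(t) = \lambda_h(u)(t)$. Also we let $y = G'(u)v$ and $y_h = G_h'(u)v$. It follows from Lemmas E.1 and E.3 that

    $$ \begin{aligned} j''(u)(v,v) - j_h''(u)(v,v) = {} & -\int_0^T \lambda(t) \Big( \frac{\partial^2 f}{(\partial x)^2}(t,x,u)\,y^2(t) + 2\frac{\partial^2 f}{\partial x\,\partial u}(t,x,u)\,y(t)v(t) + \frac{\partial^2 f}{(\partial u)^2}(t,x,u)\,v^2(t) \Big)\,dt \\ & + \int_0^T \Big( \frac{\partial^2 g}{(\partial x)^2}(t,x,u)\,y^2(t) + 2\frac{\partial^2 g}{\partial x\,\partial u}(t,x,u)\,y(t)v(t) + \frac{\partial^2 g}{(\partial u)^2}(t,x,u)\,v^2(t) \Big)\,dt \\ & + \int_0^T \lambda_h(t) \Big( \frac{\partial^2 f}{(\partial x)^2}(t,x_h,u)\,y_h^2(t) + 2\frac{\partial^2 f}{\partial x\,\partial u}(t,x_h,u)\,y_h(t)v(t) + \frac{\partial^2 f}{(\partial u)^2}(t,x_h,u)\,v^2(t) \Big)\,dt \\ & - \int_0^T \Big( \frac{\partial^2 g}{(\partial x)^2}(t,x_h,u)\,y_h^2(t) + 2\frac{\partial^2 g}{\partial x\,\partial u}(t,x_h,u)\,y_h(t)v(t) + \frac{\partial^2 g}{(\partial u)^2}(t,x_h,u)\,v^2(t) \Big)\,dt. \end{aligned} $$

    In order to show (4.6), by using a similar argument as in the proof of Lemma 4.2, it suffices to show that there exists $C > 0$, independent of $h$, such that

    $$ \|x - x_h\|_{L^\infty(I)} \le C h, \quad \|y - y_h\|_{L^\infty(I)} \le C h \|v\|_{L^2(I)}, \quad \|y_h\|_{L^2(I)} \le C \|v\|_{L^2(I)}, \tag{4.7} $$
    $$ \|\lambda_h\|_{L^\infty(I)} \le C, \quad \|\lambda - \lambda_h\|_{L^\infty(I)} \le C h, \tag{4.8} $$

    and

    $$ \int_0^T |y^2(t) - y_h^2(t)|\,dt \le C h \|v\|_{L^2(I)}^2. $$

    The first and second inequalities in (4.7) hold due to Theorem 2.1 and Lemma 4.5, respectively. The third one in (4.7) is proved in (C.2). By Lemma 3.6, the second inequality in (4.8) holds. We also find

    $$ \|\lambda_h\|_{L^\infty(I)} \le \|\lambda - \lambda_h\|_{L^\infty(I)} + \|\lambda\|_{L^\infty(I)} \le C h + C \le C, $$

    which gives the first inequality in (4.8). Finally, we obtain

    $$ \int_0^T |y^2(t) - y_h^2(t)|\,dt \le \int_0^T |y(t) + y_h(t)|\,|y(t) - y_h(t)|\,dt \le \|y - y_h\|_{L^2(I)} \big( \|y\|_{L^2(I)} + \|y_h\|_{L^2(I)} \big) \le C h \|v\|_{L^2(I)}^2 $$

    due to (4.7). All of the above estimates enable us to prove the claim (4.6). This together with Lemma 4.2 yields

    $$ j_h''(u)(v,v) \ge j''(u)(v,v) - \big| j_h''(u)(v,v) - j''(u)(v,v) \big| \ge \frac{\gamma}{2} \|v\|_{L^2(I)}^2 - C h \|v\|_{L^2(I)}^2 \ge \frac{\gamma}{4} \|v\|_{L^2(I)}^2 $$

    for $0 < h < h_0 := \gamma/(4C)$. The proof is finished.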

    We first prove the existence of the local solution to the approximate problem (2.5).

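    Proof of Theorem 2.3. Choose $\varepsilon > 0$ as in Theorem 4.3. We consider the following set

    $$ \overline{B}_{2\varepsilon}(\bar{u}) = \{ u \in U_{ad} : \|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon \}, $$

    and recall from Section 2 the space $V_R = \{ u \in U : V(u) \le R \}$. We will find a minimizer $\bar{v}$ of $j_h$ in the space $W_{\varepsilon,R} := \overline{B}_{2\varepsilon}(\bar{u}) \cap V_R$, and then show that $\|\bar{v} - \bar{u}\|_{L^2(I)} < \varepsilon$. This implies that $\bar{v}$ is a local solution to (2.5).

    Since $j_h$ is bounded from below on $W_{\varepsilon,R}$, there exists a sequence $\{v_k\}_{k \in \mathbb{N}} \subset W_{\varepsilon,R}$ such that

    $$ \lim_{k \to \infty} j_h(v_k) = \inf_{v \in W_{\varepsilon,R}} j_h(v). \tag{5.1} $$

    Moreover, since $W_{\varepsilon,R}$ is compactly embedded in $L^p(I)$ for any $p \in [1,\infty)$, up to a subsequence, there exists a function $\bar{v} \in W_{\varepsilon,R}$ such that $\{v_k\}$ converges to $\bar{v}$ in $L^2(I)$ and converges a.e. to $\bar{v}$. By definition, the function $z_k := G_h(v_k) \in X_h^r$ satisfies

    $$ \sum_{n=1}^N \big( z_k'(t) - f(t,z_k(t),v_k(t)), \varphi(t) \big)_{I_n} + \sum_{n=2}^N \big( [z_k]_{n-1}, \varphi_{n-1}^+ \big) + \big( z_{k,0}^+, \varphi_0^+ \big) = \big( x_0, \varphi_0^+ \big) \tag{5.2} $$

    for all $\varphi \in X_h^r$, where $z_{k,0}^+$ denotes the right limit of $z_k$ at $t_0$. Note that $\{z_k\}_{k \in \mathbb{N}}$ is a bounded set in the finite dimensional space $X_h^r$ by Theorem 2.4 (see also Lemma B.4). Therefore we can find a subsequence such that $z_k$ converges uniformly to a function $\bar{z} \in X_h^r$. We claim that $\bar{z} = G_h(\bar{v})$. Indeed, since $v_k(t)$ converges a.e. to $\bar{v}(t)$ for $t \in I$ and $f$ is Lipschitz continuous, we may let $k \to \infty$ in (5.2) to deduce

    $$ \sum_{n=1}^N \big( \bar{z}'(t) - f(t,\bar{z}(t),\bar{v}(t)), \varphi(t) \big)_{I_n} + \sum_{n=2}^N \big( [\bar{z}]_{n-1}, \varphi_{n-1}^+ \big) + \big( \bar{z}_0^+, \varphi_0^+ \big) = \big( x_0, \varphi_0^+ \big) $$

    for all $\varphi \in X_h^r$. This yields that $\bar{z} = G_h(\bar{v})$, which enables us to derive

    $$ \lim_{k \to \infty} j_h(v_k) = \lim_{k \to \infty} \int_0^T g(t,z_k(t),v_k(t))\,dt = \int_0^T \lim_{k \to \infty} g(t,z_k(t),v_k(t))\,dt = \int_0^T g(t,\bar{z}(t),\bar{v}(t))\,dt = \int_0^T g(t,G_h(\bar{v})(t),\bar{v}(t))\,dt = j_h(\bar{v}). $$

    This together with (5.1) implies that $\bar{v} \in W_{\varepsilon,R}$ satisfies

    $$ j_h(\bar{v}) = \inf_{v \in W_{\varepsilon,R}} j_h(v). $$

    It remains to show that the minimizer $\bar{v} \in W_{\varepsilon,R}$ lies in the open ball $B_\varepsilon(\bar{u}) = \{ u \in U_{ad} : \|u - \bar{u}\|_{L^2(I)} < \varepsilon \}$. To show this, we recall that

    $$ j(u) = J(u,G(u)) = \int_0^T g(t,G(u)(t),u(t))\,dt $$

    and

    $$ j_h(u) = J(u,G_h(u)) = \int_0^T g(t,G_h(u)(t),u(t))\,dt. $$

    Since $\|G(u)\|_{W^{1,\infty}(I)} \le C$ for all $u \in U_{ad}$, we see from Theorem 2.1 that

    $$ \|G_h(u) - G(u)\|_{L^\infty(I)} \le C h \|G(u)\|_{W^{1,\infty}(I)} \le C h, $$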

    where C>0 is independent of h. Combining this with the Lipschitz continuity of G yields that

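    $$ |j(u) - j_h(u)| \le C h, \quad \forall\, u \in U_{ad}. $$

    Take $h_0 = \gamma\varepsilon^2/(8C)$. Using this and the estimate

    $$ j(u) \ge j(\bar{u}) + \frac{\gamma}{2}\varepsilon^2, \quad \forall\, u \in U_{ad} \ \text{with} \ \varepsilon \le \|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon $$

    from Theorem 4.3, it follows that for $h \in (0,h_0)$ we have

    $$ j_h(u) \ge j_h(\bar{u}) + \frac{\gamma}{4}\varepsilon^2, \quad \forall\, u \in U_{ad} \ \text{with} \ \varepsilon \le \|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon. \tag{5.3} $$

    Thus, the minimizer $\bar{v}$ lies in $B_\varepsilon(\bar{u})$. It follows that $j_h(u) \ge j_h(\bar{v})$ for all $u \in U_{ad} \cap V_R$ with $\|u - \bar{v}\|_{L^2} \le \varepsilon$, which proves the theorem.

    We now provide the details of the convergence estimate of the approximate solutions.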

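    Proof of Theorem 2.4. Analogous to (4.3), the discrete first order necessary optimality condition for $\bar{u}_h \in U_{ad}$ reads

    $$ j_h'(\bar{u}_h)(u - \bar{u}_h) \ge 0, \quad \forall\, u \in B_\varepsilon(\bar{u}_h) \cap V_R. $$

    Inserting here $u = \bar{u}$ and summing it with (4.3) for $u = \bar{u}_h$, we get

    $$ 0 \le \big( j'(\bar{u}) - j_h'(\bar{u}_h) \big)(\bar{u}_h - \bar{u}) = \big( j'(\bar{u}) - j_h'(\bar{u}) \big)(\bar{u}_h - \bar{u}) + \big( j_h'(\bar{u}) - j_h'(\bar{u}_h) \big)(\bar{u}_h - \bar{u}). \tag{5.4} $$

    Now, by applying the mean value theorem with a value $t \in (0,1)$, we have

    $$ C \|\bar{u}_h - \bar{u}\|_{L^2(I)}^2 \le j_h''\big( \bar{u} - t(\bar{u} - \bar{u}_h) \big)(\bar{u}_h - \bar{u}, \bar{u}_h - \bar{u}) = \big( j_h'(\bar{u}_h) - j_h'(\bar{u}) \big)(\bar{u}_h - \bar{u}) \le \big( j'(\bar{u}) - j_h'(\bar{u}) \big)(\bar{u}_h - \bar{u}), \tag{5.5} $$

    where we used Lemma 4.6 in the first inequality and (5.4) in the second inequality. It only remains to estimate the right hand side. Let us express it using the adjoint states. From (3.5), we have

    $$ j'(\bar{u})(\bar{u}_h - \bar{u}) = \big( \partial_u g(\cdot,\bar{x},\bar{u}) - \partial_u f(\cdot,\bar{x},\bar{u})\lambda(\bar{u}), \bar{u}_h - \bar{u} \big)_I, \tag{5.6} $$

    and it follows from (3.9) that

    $$ j_h'(\bar{u})(\bar{u}_h - \bar{u}) = \big( \partial_u g(\cdot,\bar{x}_h,\bar{u}) - \partial_u f(\cdot,\bar{x}_h,\bar{u})\lambda_h(\bar{u}), \bar{u}_h - \bar{u} \big)_I. \tag{5.7} $$

    Here we recall that $\bar{x}_h \in X_h^r$ denotes the solution to (2.3) with control $\bar{u}$ and initial data $x_0$. Combining (5.6) and (5.7) we find

    $$ \big( j'(\bar{u}) - j_h'(\bar{u}) \big)(\bar{u}_h - \bar{u}) = \big( \partial_u g(\cdot,\bar{x},\bar{u}) - \partial_u g(\cdot,\bar{x}_h,\bar{u}), \bar{u}_h - \bar{u} \big)_I - \big( \partial_u f(\cdot,\bar{x},\bar{u})\lambda(\bar{u}) - \partial_u f(\cdot,\bar{x}_h,\bar{u})\lambda_h(\bar{u}), \bar{u}_h - \bar{u} \big)_I. $$

    Applying Hölder's inequality here and using (1.3), we deduce

    $$ \begin{aligned} \big( j'(\bar{u}) - j_h'(\bar{u}) \big)(\bar{u}_h - \bar{u}) \le {} & \|\partial_u \partial_x g\|_{L^\infty} \|\bar{x} - \bar{x}_h\|_{L^2(I)} \|\bar{u}_h - \bar{u}\|_{L^2(I)} \\ & + \|\lambda(\bar{u})\|_{L^\infty(I)} \|\partial_u f(\cdot,\bar{x},\bar{u}) - \partial_u f(\cdot,\bar{x}_h,\bar{u})\|_{L^2(I)} \|\bar{u}_h - \bar{u}\|_{L^2(I)} \\ & + \|\partial_u f(\cdot,\bar{x}_h,\bar{u})\|_{L^\infty} \|\lambda(\bar{u}) - \lambda_h(\bar{u})\|_{L^2(I)} \|\bar{u}_h - \bar{u}\|_{L^2(I)} \\ \le {} & C \big( \|\bar{x} - \bar{x}_h\|_{L^2(I)} + \|\lambda(\bar{u}) - \lambda_h(\bar{u})\|_{L^2(I)} \big) \|\bar{u}_h - \bar{u}\|_{L^2(I)}. \end{aligned} \tag{5.8} $$

    Now we apply (3.10) and (3.13) to get

    $$ \big( j'(\bar{u}) - j_h'(\bar{u}) \big)(\bar{u}_h - \bar{u}) \le C h^{\min\{k_1,k_2,r+1\}} \|\bar{u}_h - \bar{u}\|_{L^2(I)}. \tag{5.9} $$

    Combining this with (5.5), we finally obtain

    $$ \|\bar{u}_h - \bar{u}\|_{L^2(I)} \le C h^{\min\{k_1,k_2,r+1\}}. $$

    This completes the proof.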

    This section is devoted to the existence and convergence results for the fully discrete case. We consider a finite dimensional space $U_h$ which discretizes the control space $U_{ad}$, for example, the space of step functions

    $$ U_h = \{ u \in U_{ad} \mid u \ \text{is piecewise constant on} \ I_k = [t_{k-1},t_k] \}, $$

    or the high-order DG space $U_h = X_h^r \cap U_{ad}$ with $r \in \mathbb{N}$.
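
    For the step-function space, one natural candidate for a map into $U_h$ is the cell-average ($L^2$) projection followed by clipping to the bounds. The sketch below illustrates this for a scalar control on a uniform mesh; it is only an illustration under these assumptions, and the names `project_piecewise_constant` and `l2_error`, the midpoint quadrature, and the test control $\sin(3t)$ are hypothetical choices rather than the paper's construction.

```python
import math

def project_piecewise_constant(u, T, N, lo, hi, quad=20):
    """Cell averages of u on N uniform cells of [0, T], clipped to [lo, hi]."""
    h = T / N
    values = []
    for k in range(N):
        # Composite midpoint rule for the average of u over the k-th cell.
        avg = sum(u(k * h + (i + 0.5) * h / quad) for i in range(quad)) / quad
        values.append(min(max(avg, lo), hi))
    return values

def l2_error(u, values, T, quad=20):
    """Approximate L2(0, T) distance between u and the step function `values`."""
    N = len(values)
    h = T / N
    total = 0.0
    for k in range(N):
        for i in range(quad):
            t = k * h + (i + 0.5) * h / quad
            total += (u(t) - values[k]) ** 2 * h / quad
    return math.sqrt(total)

if __name__ == "__main__":
    u = lambda t: math.sin(3.0 * t)
    for N in (20, 40, 80):
        vals = project_piecewise_constant(u, 1.0, N, -1.0, 1.0)
        print(N, l2_error(u, vals, 1.0))
```

    For this smooth control the printed error roughly halves with each refinement, i.e. first order in $h$; a discontinuous control would generally give a lower rate.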

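    We say that $\bar{u}_h \in U_h$ is a local solution to

    $$ \min_{u \in U_h} j_h(u), \tag{6.1} $$

    if there is a value $\varepsilon > 0$ such that $j_h(u) \ge j_h(\bar{u}_h)$ for all $u \in U_h$ with $\|u - \bar{u}_h\|_{L^2} \le \varepsilon$.

    The existence of a local solution is provided in the following theorem.

    Theorem 6.1. Choose $\varepsilon > 0$ as in Theorem 4.3. Let $\bar{u} \in U_{ad}$ be a local solution of (1.4) satisfying Assumption 1. Fix any $\varepsilon > 0$. Then there exists $h_0 > 0$ such that for $h \in (0,h_0)$ problem (6.1) has a local solution $\bar{u}_h \in U_h$ such that $\|\bar{u} - \bar{u}_h\|_{L^2} \le \varepsilon$.

    Proof. By compactness and continuity, $j_h$ has a minimizer $\bar{u}_h$ in

    $$ \overline{B}_{2\varepsilon}(\bar{u}) = \{ u \in U_h : \|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon \}, $$

    since $U_h$ is finite dimensional. Next we aim to show that the minimizer $\bar{u}_h$ satisfies

    $$ \|\bar{u}_h - \bar{u}\|_{L^2(I)} \le \varepsilon. $$

    To show this, we recall from (5.3) that there is a value $h_0 > 0$ such that for $h \in (0,h_0)$ we have

    $$ j_h(u) \ge j_h(\bar{u}) + \frac{\gamma}{4}\varepsilon^2, \quad \forall\, u \in U_{ad} \ \text{with} \ \varepsilon \le \|u - \bar{u}\|_{L^2(I)} \le 2\varepsilon. $$

    Combining this with the minimality of $\bar{u}_h$ for $j_h$ in $\overline{B}_{2\varepsilon}(\bar{u})$, we find that $\|\bar{u}_h - \bar{u}\|_{L^2(I)} \le \varepsilon$. It then yields that

    $$ j_h(u) \ge j_h(\bar{u}_h), \quad \forall\, u \in U_h \ \text{with} \ \|u - \bar{u}_h\|_{L^2} \le \varepsilon. $$

    Thus $\bar{u}_h$ is a local solution of (6.1).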

    We establish the convergence result in the following theorem.

    Theorem 6.2. Assume the same hypotheses on $\bar u\in U_{ad}$ and $\lambda(\bar u)$ as in Theorem 2.4. In addition, suppose that there exist a projection operator $P_h:U\to U_h$ and a value $a>0$ such that

    $$\|P_h\bar u-\bar u\|_{L^2(I)}=O(h^a)\quad\text{for } h\in(0,1).$$

    Let $\bar u_h\in U_h$ be a local solution to (6.1) constructed in Theorem 6.1. Then the following estimate holds:

    $$\|\bar u_h-\bar u\|_{L^2(I)}=O\big(h^{\min\{r+1,k_1,k_2,a/2\}}\big).$$

    If we further assume that $j'(\bar u)=0$, then the above estimate can be improved to

    $$\|\bar u_h-\bar u\|_{L^2(I)}=O\big(h^{\min\{r+1,k_1,k_2,a\}}\big).$$

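    Proof. By the first-order optimality conditions for $\bar u$ and $\bar u_h$, we have

    $$j'(\bar u)(\bar u_h-\bar u)\ge 0\quad\text{and}\quad j_h'(\bar u_h)(P_h\bar u-\bar u_h)\ge 0.$$

    The latter condition can be written as

    $$0\le j_h'(\bar u_h)(\bar u-\bar u_h)+j_h'(\bar u_h)(P_h\bar u-\bar u)=j_h'(\bar u_h)(\bar u-\bar u_h)+R_h,$$

    where $R_h:=j_h'(\bar u_h)(P_h\bar u-\bar u)$. Summing up the above two inequalities provides

    $$0\le (j'(\bar u)-j_h'(\bar u_h))(\bar u_h-\bar u)+R_h=(j'(\bar u)-j_h'(\bar u))(\bar u_h-\bar u)+(j_h'(\bar u)-j_h'(\bar u_h))(\bar u_h-\bar u)+R_h,$$

    i.e.,

    $$(j_h'(\bar u_h)-j_h'(\bar u))(\bar u_h-\bar u)\le (j'(\bar u)-j_h'(\bar u))(\bar u_h-\bar u)+R_h. \tag{6.2}$$

    By the assumption of the theorem,

    $$|R_h|=O(h^a). \tag{6.3}$$

    On the other hand, by applying the mean value theorem and Lemma 4.6, we obtain, for some $t\in(0,1)$,

    $$(j_h'(\bar u_h)-j_h'(\bar u))(\bar u_h-\bar u)=j_h''(\bar u+t(\bar u_h-\bar u))(\bar u_h-\bar u,\bar u_h-\bar u)\ge C\|\bar u_h-\bar u\|_{L^2(I)}^2.$$

    Combining this with (6.2) yields

    $$\|\bar u_h-\bar u\|_{L^2(I)}^2\le C(j'(\bar u)-j_h'(\bar u))(\bar u_h-\bar u)+CR_h.$$

    Applying here the estimate (5.9) from the previous proof, we have

    $$\|\bar u_h-\bar u\|_{L^2(I)}^2\le Ch^{\min\{k_1,k_2,r+1\}}\|\bar u_h-\bar u\|_{L^2(I)}+CR_h, \tag{6.4}$$

    which together with (6.3) gives the desired estimate

    $$\|\bar u_h-\bar u\|_{L^2(I)}=O\big(h^{\min\{r+1,k_1,k_2,a/2\}}\big).$$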
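    The step from (6.4) and (6.3) to this rate is an absorption argument. The following is a short sketch of it, reading (6.3) as $|R_h|\le Ch^a$; the shorthand $e$ and $m$ is introduced only for this illustration, and $C$ denotes one fixed constant that dominates those in (6.3) and (6.4).

    ```latex
    % Illustrative shorthand (not in the original text):
    %   e := \|\bar u_h-\bar u\|_{L^2(I)},   m := \min\{k_1,k_2,r+1\}
    \begin{aligned}
    e^2 &\le C h^{m} e + C h^{a}
        && \text{by (6.4) and } |R_h|\le Ch^a,\\
        &\le \tfrac12 e^2 + \tfrac{C^2}{2} h^{2m} + C h^{a}
        && \text{by Young's inequality } \alpha\beta\le\tfrac12\alpha^2+\tfrac12\beta^2,\\
    e^2 &\le C^2 h^{2m} + 2C h^{a}
        && \text{after absorbing } \tfrac12 e^2 \text{ on the left},\\
    e   &\le C h^{m} + \sqrt{2C}\,h^{a/2} = O\big(h^{\min\{m,\,a/2\}}\big)
        && \text{for } h\in(0,1).
    \end{aligned}
    ```

    The exponent $a/2$ appears because $R_h$ enters (6.4) without a factor $\|\bar u_h-\bar u\|_{L^2(I)}$, so the square root is taken of $h^a$ itself.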

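    When we further assume $j'(\bar u)=0$, it follows that

    $$j_h'(\bar u_h)=(j_h'(\bar u_h)-j_h'(\bar u))+(j_h'(\bar u)-j'(\bar u)).$$

    Using this and the estimates in (5.8), we find

    $$|R_h|=|j_h'(\bar u_h)(P_h\bar u-\bar u)|\le C\big(\|\bar u_h-\bar u\|_{L^2(I)}+h^{\min\{k_1,k_2,r+1\}}\big)\|P_h\bar u-\bar u\|_{L^2(I)}\le Ch^a\big(\|\bar u_h-\bar u\|_{L^2(I)}+h^{\min\{k_1,k_2,r+1\}}\big).$$

    Inserting this into (6.4) yields

    $$\|\bar u_h-\bar u\|_{L^2(I)}^2\le Ch^{\min\{k_1,k_2,r+1\}}\|\bar u_h-\bar u\|_{L^2(I)}+Ch^a\big(\|\bar u_h-\bar u\|_{L^2(I)}+h^{\min\{k_1,k_2,r+1\}}\big).$$

    It gives the desired estimate

    $$\|\bar u_h-\bar u\|_{L^2(I)}=O\big(h^{\min\{r+1,k_1,k_2,a\}}\big).$$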

    The proof is done.

    In this section, we present several numerical experiments which validate our theoretical results. The forward-backward DG method [18] is employed to solve the example OCPs.

    Let us consider the following simple one-dimensional OCP, which has been used as an example in [36], that consists of minimizing the functional

    $$J=\frac12\int_0^1\big(x^2(t)+u^2(t)\big)\,dt$$

    subject to the state equation

    $$x'(t)=-x(t)+u(t),\quad x(0)=1, \tag{7.1}$$

    and $U=L^2([0,1])$. Using a similar idea as in Section 3 based on the maximum principle, we can derive the adjoint equation to the above optimal control problem:

    $$\lambda'(t)=\lambda(t)-x(t),\quad \lambda(1)=0.$$

    Furthermore, we also find that the optimal solutions satisfy $\bar u=-\lambda$ and $\bar x$ satisfies (7.1). Thus we have the solution

    $$\bar x(t)=\frac{\sqrt2\cosh(\sqrt2(t-1))-\sinh(\sqrt2(t-1))}{\sqrt2\cosh(\sqrt2)+\sinh(\sqrt2)}$$

    and

    $$\bar u(t)=\frac{\sinh(\sqrt2(t-1))}{\sqrt2\cosh(\sqrt2)+\sinh(\sqrt2)}.$$
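    As a quick sanity check of these closed-form expressions, the following sketch evaluates them and measures how well they satisfy the state equation. It assumes the state equation reads $x'(t)=-x(t)+u(t)$ with $x(0)=1$; the function names and the finite-difference step are illustrative choices, not part of the original experiments.

    ```python
    import math

    SQRT2 = math.sqrt(2.0)
    DENOM = SQRT2 * math.cosh(SQRT2) + math.sinh(SQRT2)


    def x_bar(t):
        """Closed-form optimal state of the first example."""
        s = SQRT2 * (t - 1.0)
        return (SQRT2 * math.cosh(s) - math.sinh(s)) / DENOM


    def u_bar(t):
        """Closed-form optimal control of the first example."""
        return math.sinh(SQRT2 * (t - 1.0)) / DENOM


    def state_residual(t, dt=1e-5):
        """Central-difference residual of x'(t) + x(t) - u(t) (assumed state equation)."""
        dx = (x_bar(t + dt) - x_bar(t - dt)) / (2.0 * dt)
        return dx + x_bar(t) - u_bar(t)


    # Largest residual on an interior grid of [0, 1]; it should be at rounding level.
    max_residual = max(abs(state_residual(0.01 * k)) for k in range(1, 100))
    ```

    The check also makes the boundary behaviour visible: `x_bar(0.0)` returns the initial value 1, and `u_bar(1.0)` vanishes, matching the terminal condition of the adjoint state.
    
    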

    For fixed $r\in\mathbb{N}$, we use $X_h^r$ as the approximation space of $U$. In Table 1, we report the discrete $L^2$ errors between the optimal solutions and their approximations for the above optimal control problem. Here $r+1$ is the number of grid points on each time interval $I_n$, and we used equidistant points for our numerical computations. The numerical results confirm that the error is of order $h^{r+1}$, as proved in Theorem 2.4.
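    To illustrate the forward-backward idea on this example, here is a minimal sketch and not the authors' implementation. It makes several assumptions that should be read as such: it uses the lowest-order case $r=0$ (for which the DG step with cellwise constant control coincides with the backward Euler step), it takes the adjoint equation as $\lambda'=\lambda-x$, $\lambda(1)=0$ with $\bar u=-\lambda$, it discretizes the adjoint with an analogous implicit step, and it updates the control by a plain fixed-point iteration. All names are hypothetical.

    ```python
    import math


    def exact_u(t):
        """Closed-form optimal control of the first example, used only for comparison."""
        r2 = math.sqrt(2.0)
        return math.sinh(r2 * (t - 1.0)) / (r2 * math.cosh(r2) + math.sinh(r2))


    def forward_backward_sweep(n_cells=200, n_sweeps=60):
        """Fixed-point sweep for x' = -x + u, x(0) = 1, cost 0.5 * int_0^1 (x^2 + u^2) dt."""
        h = 1.0 / n_cells
        u = [0.0] * n_cells              # one control value per cell I_n
        x = [0.0] * n_cells
        for _ in range(n_sweeps):
            # Forward pass (r = 0 DG step): x_n - x_{n-1} = h * (-x_n + u_n).
            prev = 1.0
            for n in range(n_cells):
                prev = (prev + h * u[n]) / (1.0 + h)
                x[n] = prev
            # Backward pass for lam' = lam - x, lam(1) = 0 (assumed implicit step).
            lam = [0.0] * n_cells
            nxt = 0.0
            for n in reversed(range(n_cells)):
                nxt = (nxt + h * x[n]) / (1.0 + h)
                lam[n] = nxt
            # Control update from the assumed optimality condition u = -lam.
            u = [-value for value in lam]
        return h, x, u


    h, x, u = forward_backward_sweep()
    # Maximum deviation from the closed-form control at the cell midpoints.
    err = max(abs(u[n] - exact_u((n + 0.5) * h)) for n in range(len(u)))
    ```

    With $r=0$ only first-order accuracy can be expected, so this sketch is not meant to reproduce the rates reported in Table 1; it only shows the alternation between a forward state solve, a backward adjoint solve, and a control update.
    
    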

    Table 1.  Discrete $L^2$ errors $\|\bar x-\bar x_h\|_{L^2(I)}$ and $\|\bar u-\bar u_h\|_{L^2(I)}$; the last two columns report the observed orders $\log_2\frac{\|\bar x-\bar x_{2h}\|}{\|\bar x-\bar x_h\|}$ and $\log_2\frac{\|\bar u-\bar u_{2h}\|}{\|\bar u-\bar u_h\|}$.
    r   h             error in x    error in u    order (x)   order (u)
    1   0.1×2^{0}     1.9455e-03    6.2543e-04    -           -
    1   0.1×2^{-1}    4.8861e-04    1.6088e-04    2.00        1.96
    1   0.1×2^{-2}    1.2240e-04    4.0780e-05    2.00        1.98
    1   0.1×2^{-3}    3.0629e-05    1.0264e-05    2.00        1.99
    1   0.1×2^{-4}    7.6607e-06    2.5748e-06    2.00        2.00
    1   0.1×2^{-5}    1.9156e-06    6.4477e-07    2.00        2.00
    2   0.1×2^{0}     2.6708e-05    1.3269e-05    -           -
    2   0.1×2^{-1}    3.3523e-06    1.6837e-06    2.99        2.98
    2   0.1×2^{-2}    4.1979e-07    2.1202e-07    3.00        2.99
    2   0.1×2^{-3}    5.2518e-08    2.6599e-08    3.00        3.00
    2   0.1×2^{-4}    6.5673e-09    3.3308e-09    3.00        3.00
    2   0.1×2^{-5}    8.2108e-10    4.1672e-10    3.00        3.00
    3   0.1×2^{0}     2.8964e-07    9.5564e-08    -           -
    3   0.1×2^{-1}    1.8172e-08    6.0617e-09    4.00        3.98
    3   0.1×2^{-2}    1.1377e-09    3.8151e-10    4.00        3.99
    3   0.1×2^{-3}    7.1152e-11    2.3918e-11    4.00        4.00
    3   0.1×2^{-4}    4.4370e-12    1.4871e-12    4.00        4.01
    3   0.1×2^{-5}    2.7555e-13    8.4657e-14    4.01        4.13
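    The order columns can be recomputed directly from the error columns. The sketch below does this for the state errors with $r=1$ copied from Table 1; the helper name is illustrative, and the computation assumes that $h$ is halved from one row to the next.

    ```python
    import math


    def observed_orders(errors):
        """Return log2(e_{2h} / e_h) for consecutive errors, with h halved each time."""
        return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]


    # State errors for r = 1, copied from Table 1 (h = 0.1 * 2^0, ..., 0.1 * 2^-5).
    state_errors_r1 = [1.9455e-03, 4.8861e-04, 1.2240e-04,
                       3.0629e-05, 7.6607e-06, 1.9156e-06]
    orders = observed_orders(state_errors_r1)
    ```

    Each entry of `orders` lies close to 2, in line with the predicted rate $h^{r+1}$ for $r=1$.
    
    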


    In this part, we consider the following nonlinear optimal control problem:

    $$J=\frac12\int_0^{1/5}\big(x^2(t)+u^2(t)\big)\,dt$$

    subject to the state equation

    $$x'(t)=x^2(t)+u(t),\quad x(0)=2. \tag{7.2}$$

    In this case, the corresponding adjoint equation and optimal control are given by

    $$\lambda'(t)=-x(t)\big(1+2\lambda(t)\big)\quad\text{and}\quad \bar u(t)=-\lambda(t),$$

    and thus the optimal state $\bar x$ solves

    $$x'(t)=x^2(t)-\lambda(t),\quad x(0)=2.$$

    In this case, since we have no explicit form of the exact solutions, we take as reference solutions $\bar x_h$ (resp., $\bar u_h$) computed with $h=0.1\times2^{-9}$ in place of $\bar x$ (resp., $\bar u$). In Table 2, we report the discrete $L^2$ errors between the reference solutions and their approximations.

    Table 2.  Discrete $L^2$ errors $\|\bar x-\bar x_h\|_{L^2(I)}$ and $\|\bar u-\bar u_h\|_{L^2(I)}$; the last two columns report the observed orders $\log_2\frac{\|\bar x-\bar x_{2h}\|}{\|\bar x-\bar x_h\|}$ and $\log_2\frac{\|\bar u-\bar u_{2h}\|}{\|\bar u-\bar u_h\|}$.
    r   h             error in x    error in u    order (x)   order (u)
    1   0.1           1.3006e-02    2.6587e-03    -           -
    1   0.1×2^{-1}    4.5715e-03    6.8872e-04    1.51        1.95
    1   0.1×2^{-2}    1.3286e-03    1.7024e-04    1.78        2.02
    1   0.1×2^{-3}    3.5677e-04    4.2187e-05    1.90        2.01
    1   0.1×2^{-4}    9.2305e-05    1.0492e-05    1.95        2.01
    1   0.1×2^{-5}    2.3420e-05    2.6101e-06    1.98        2.01
    2   0.1           7.9288e-04    7.1751e-05    -           -
    2   0.1×2^{-1}    1.6928e-04    6.8412e-06    2.23        3.40
    2   0.1×2^{-2}    2.7566e-05    7.2059e-07    2.62        3.25
    2   0.1×2^{-3}    3.9391e-06    8.4373e-08    2.81        3.10
    2   0.1×2^{-4}    5.2676e-07    1.0332e-08    2.90        3.03
    2   0.1×2^{-5}    6.8107e-08    1.2833e-09    2.95        3.01
    3   0.1           4.8978e-05    2.3326e-06    -           -
    3   0.1×2^{-1}    5.8217e-06    2.0158e-07    3.07        3.53
    3   0.1×2^{-2}    5.0236e-07    1.3655e-08    3.53        3.88
    3   0.1×2^{-3}    3.6929e-08    8.7619e-10    3.77        3.96
    3   0.1×2^{-4}    2.5037e-09    5.5551e-11    3.88        3.98
    3   0.1×2^{-5}    1.6329e-10    3.6858e-12    3.94        3.91


    Next we consider a two-dimensional problem given by

    $$J=\frac12\int_0^{1/5}\big(x^2(t)+y^2(t)+u^2(t)\big)\,dt,$$

    subject to the state equation

    $$x'(t)=x^2(t)+y(t),\quad x(0)=2,\qquad y'(t)=-3y(t)+u(t),\quad y(0)=1. \tag{7.3}$$

    In this case, the corresponding adjoint equations and optimal control are given by

    $$\lambda_1'(t)=-x(t)\big(1+2\lambda_1(t)\big),\qquad \lambda_2'(t)=-y(t)-\lambda_1(t)+3\lambda_2(t),\qquad\text{and}\quad \bar u(t)=-\lambda_2(t). \tag{7.4}$$

    This case also has no explicit form of the exact solutions, and so we take as reference solutions $\bar x_h$ (resp., $\bar u_h$) computed with $h=0.1\times2^{-9}$ in place of $\bar x$ (resp., $\bar u$). The discrete $L^2$ errors between the reference solutions and their approximations are reported in Table 3.

    Table 3.  Discrete $L^2$ errors $\|\bar x-\bar x_h\|_{L^2(I)}$ and $\|\bar u-\bar u_h\|_{L^2(I)}$; the last two columns report the observed orders $\log_2\frac{\|\bar x-\bar x_{2h}\|}{\|\bar x-\bar x_h\|}$ and $\log_2\frac{\|\bar u-\bar u_{2h}\|}{\|\bar u-\bar u_h\|}$.
    r   h             error in x    error in u    order (x)   order (u)
    1   0.1           5.6850e-03    3.6402e-03    -           -
    1   0.1×2^{-1}    1.6706e-03    1.1148e-03    1.48        1.71
    1   0.1×2^{-2}    4.5109e-04    2.9952e-04    1.77        1.90
    1   0.1×2^{-3}    1.1702e-04    7.7189e-05    1.89        1.96
    1   0.1×2^{-4}    2.9736e-05    1.9566e-05    1.95        1.98
    1   0.1×2^{-5}    7.4372e-06    4.9221e-06    1.98        1.99
    2   0.1           1.1860e-03    2.9482e-05    -           -
    2   0.1×2^{-1}    2.5679e-04    2.6302e-06    2.21        3.49
    2   0.1×2^{-2}    4.2605e-05    3.5132e-07    2.59        2.90
    2   0.1×2^{-3}    6.1623e-06    4.8266e-08    2.79        2.86
    2   0.1×2^{-4}    8.2960e-07    6.3722e-09    2.89        2.92
    2   0.1×2^{-5}    1.0764e-07    8.1940e-10    2.95        2.96
    3   0.1           7.3645e-05    1.0438e-06    -           -
    3   0.1×2^{-1}    9.4018e-06    6.7811e-08    2.97        3.94
    3   0.1×2^{-2}    8.4549e-07    4.1743e-09    3.48        4.02
    3   0.1×2^{-3}    6.3517e-08    2.5778e-10    3.73        4.02
    3   0.1×2^{-4}    4.3534e-09    1.6014e-11    3.87        4.01
    3   0.1×2^{-5}    2.8493e-10    9.9925e-13    3.93        4.00


    In this paper, we established an error analysis for the DG discretization of the nonlinear OCP with piecewise polynomials of arbitrary degree $r$, for nonlinear functions $f$ and $g$ satisfying suitable smoothness assumptions. Under the non-degeneracy condition on an optimal control of the OCP, we obtained the existence of a local solution to the approximate problem and sharp $L^2$-error estimates for the approximate solutions. These results were extended to the fully discrete case, in which the control space is also discretized. Finally, we presented numerical experiments validating our theoretical results. Based on the results of this paper, it would be interesting to analyze the mesh refinement method for the discontinuous Galerkin method applied to optimal control problems. We would like to investigate this problem in the future.

    The authors are grateful to the referees for valuable comments on the paper. The work of W. Choi is supported by NRF grants (No. 2016R1A5A1008055) and (No. 2021R1F1A1059671).

    The authors declare no conflict of interest.

    In this section, we show that the control-to-state mapping G is twice differentiable, and obtain the derivatives.

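    Lemma A.1. Let $x^s=G(u+sv)$ and let $y:[0,T]\to\mathbb{R}^d$ be the solution of

    $$y'(t)=\frac{\partial f}{\partial x}(t,x(t),u(t))\,y(t)+\frac{\partial f}{\partial u}(t,x(t),u(t))\,v(t),\quad t\in(0,T),\qquad y(0)=0. \tag{A.1}$$

    Then we have

    $$\frac{d}{ds}G(u+sv)\Big|_{s=0}=y.$$

    Proof. Recall that $x^s$ and $x$ satisfy

    $$(x^s)'(t)=f(t,x^s(t),u(t)+sv(t))\quad\text{and}\quad x'(t)=f(t,x(t),u(t)),$$

    respectively. Using this, we find that $r(t):=x^s(t)-x(t)-sy(t)$ satisfies

    $$\begin{aligned}(x^s-x-sy)'(t)&=f(t,x^s,u+sv)-f(t,x,u)-s\Big(\frac{\partial f}{\partial x}(t,x,u)\,y(t)+\frac{\partial f}{\partial u}(t,x,u)\,v(t)\Big)\\&=:\frac{\partial f}{\partial x}(t,x,u)\big(x^s(t)-x(t)-sy(t)\big)+A_1(t)+A_2(t),\end{aligned} \tag{A.2}$$

    where

    $$A_1(t):=f(t,x^s,u)-f(t,x,u)-\frac{\partial f}{\partial x}(t,x,u)\big(x^s(t)-x(t)\big),$$

    and

    $$A_2(t):=f(t,x^s,u+sv)-f(t,x^s,u)-s\frac{\partial f}{\partial u}(t,x,u)\,v(t).$$

    Given that $|x^s(t)-x(t)|\le Cs$ and (1.3), an elementary calculation shows that $|A_1|\le Cs^2$ and $|A_2|\le Cs^2$. With these bounds, we may apply Grönwall's lemma to (A.2) to deduce $|r(t)|\le Cs^2$ for $t\in[0,T]$. From this we find

    $$\lim_{s\to0}\frac{x^s(t)-x(t)-sy(t)}{s}=0,$$

    which yields that

    $$\frac{d}{ds}x^s(t)\Big|_{s=0}=y(t).$$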
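    The quadratic bound $|x^s(t)-x(t)-sy(t)|\le Cs^2$ can be observed numerically. The sketch below is an illustration under assumptions that are not in the original: the scalar right-hand side $f(t,x,u)=-x^2+u$, the control $u(t)=\sin t$, the direction $v(t)=\cos t$, the data $x(0)=1$, $T=1$, and a classical RK4 integrator in place of the exact flow.

    ```python
    import math


    def f(x, u):
        """Hypothetical right-hand side f(t, x, u) = -x^2 + u."""
        return -x * x + u


    def f_x(x, u):
        return -2.0 * x


    def f_u(x, u):
        return 1.0


    def u_ctrl(t):
        """Hypothetical control u."""
        return math.sin(t)


    def v_dir(t):
        """Hypothetical perturbation direction v."""
        return math.cos(t)


    def solve(s, n_steps=400, T=1.0, x0=1.0):
        """RK4 for x' = f(x, u + s v) together with y' = f_x y + f_u v; returns x(T), y(T)."""
        def rhs(t, x, y):
            u = u_ctrl(t) + s * v_dir(t)
            return f(x, u), f_x(x, u) * y + f_u(x, u) * v_dir(t)

        h = T / n_steps
        x, y, t = x0, 0.0, 0.0
        for _ in range(n_steps):
            k1 = rhs(t, x, y)
            k2 = rhs(t + h / 2, x + h / 2 * k1[0], y + h / 2 * k1[1])
            k3 = rhs(t + h / 2, x + h / 2 * k2[0], y + h / 2 * k2[1])
            k4 = rhs(t + h, x + h * k3[0], y + h * k3[1])
            x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
            y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
            t += h
        return x, y


    # For s = 0 the second component solves the linearized equation (A.1) along x.
    x_base, y_T = solve(0.0)
    # Remainders |x^s(T) - x(T) - s y(T)| for decreasing s.
    remainders = [abs(solve(s)[0] - x_base - s * y_T) for s in (1e-1, 1e-2, 1e-3)]
    ```

    Reducing $s$ by a factor of 10 should reduce the remainder by roughly a factor of 100, which is the behaviour the lemma's proof predicts.
    
    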

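    Next we show the twice differentiability of the mapping $s\mapsto G(u+sv)$ at $s=0$.

    Lemma A.2. Let $z:[0,T]\to\mathbb{R}^d$ be the solution of

    $$z'(t)=\frac{\partial^2 f}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x(t),u(t))\,y(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)+\frac{\partial f}{\partial x}(t,x(t),u(t))\,z(t),\qquad z(0)=0.$$

    Then we have

    $$\frac{d^2}{(ds)^2}G(u+sv)\Big|_{s=0}=z(t).$$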

    Proof. Let

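    $$y^s(t)=\frac{d}{ds}G(u+sv)\quad\text{and}\quad y(t)=\frac{d}{ds}G(u+sv)\Big|_{s=0}.$$

    Then we get

    $$\begin{aligned}(y^s)'(t)-y'(t)-sz'(t)&=\frac{\partial f}{\partial x}(t,x^s,u+sv)\,y^s(t)+\frac{\partial f}{\partial u}(t,x^s,u+sv)\,v(t)-\frac{\partial f}{\partial x}(t,x,u)\,y(t)-\frac{\partial f}{\partial u}(t,x,u)\,v(t)\\&\quad-s\Big[\frac{\partial^2 f}{(\partial x)^2}(t,x,u)\,y^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x,u)\,y(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x,u)\,v^2(t)+\frac{\partial f}{\partial x}(t,x,u)\,z(t)\Big]\\&=:\frac{\partial f}{\partial x}(t,x,u)\big(y^s(t)-y(t)-sz(t)\big)+A_1(t)+A_2(t),\end{aligned} \tag{A.3}$$

    where

    $$A_1(t):=\Big[\frac{\partial f}{\partial x}(t,x^s,u+sv)-\frac{\partial f}{\partial x}(t,x,u)\Big]y^s(t)-s\Big[\frac{\partial^2 f}{(\partial x)^2}(t,x,u)\,y(t)+\frac{\partial^2 f}{\partial x\partial u}(t,x,u)\,v(t)\Big]y(t)$$

    and

    $$A_2(t):=\Big[\frac{\partial f}{\partial u}(t,x^s,u+sv)-\frac{\partial f}{\partial u}(t,x,u)\Big]v(t)-s\Big[\frac{\partial^2 f}{(\partial u)^2}(t,x,u)\,v(t)+\frac{\partial^2 f}{\partial x\partial u}(t,x,u)\,y(t)\Big]v(t).$$

    By Lemma 4.1, we have $|y^s(t)-y(t)|\le Cs$. Given this estimate and that

    $$\frac{d}{ds}x^s(t)\Big|_{s=0}=y(t)$$

    from Lemma A.1, an elementary calculation shows that $|A_1(t)|\le Cs^2$ and $|A_2(t)|\le Cs^2$. Inserting these estimates into (A.3) and applying Grönwall's lemma, we find

    $$y^s(t)-y(t)-sz(t)=O(s^2).$$

    It proves that

    $$\frac{d}{ds}y^s(t)\Big|_{s=0}=z(t).$$

    This implies that

    $$\frac{d^2}{(ds)^2}G(u+sv)\Big|_{s=0}=z(t)$$

    since

    $$y^s(t)=\frac{d}{ds}G(u+sv).$$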

    This completes the proof.

    In this section, we provide a Grönwall-type inequality for the DG discretization of ODEs with inputs. It will be used in Section C to establish the differentiability of the discrete control-to-state mapping Gh.

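    We begin by recalling the following lemma from [34, Lemma 2.4].

    Lemma B.1. Let $I=(a,b)$ and $k=b-a>0$. Then we have

    $$\int_a^b|\phi(t)|^2\,dt\le\frac1k\sum_{i=1}^d\Big(\int_a^b\phi_i(t)\,dt\Big)^2+\frac12\int_a^b(b-t)(t-a)|\phi'(t)|^2\,dt$$

    for all $\phi(t)=(\phi_1(t),\dots,\phi_d(t))\in P^r((a,b);\mathbb{R}^d)$, $r\in\mathbb{N}_0$, where

    $$P^r((a,b);\mathbb{R}^d)=\{(p_1,\dots,p_d):\ p_i:(a,b)\to\mathbb{R}\ \text{is a polynomial of order at most } r\}.$$

    The next result is from [34, Lemma 3.1].

    Lemma B.2. For $I=(a,b)$ and $r\in\mathbb{N}_0$, we have

    $$\|\phi\|_{L^\infty(I)}^2\le C\log(r+1)\int_a^b|\phi'(t)|^2(t-a)\,dt+C|\phi(b)|^2$$

    for all $\phi(t)=(\phi_1(t),\dots,\phi_d(t))\in P^r((a,b);\mathbb{R}^d)$. Here $C>0$ is independent of $r$, $a$, $b$, and $d$.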

    We shall use the following Grönwall inequality.

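    Lemma B.3. Let $\{a_n\}_{n=1}^N$ and $\{b_n\}_{n=1}^N$ be sequences of non-negative numbers satisfying $b_1\le b_2\le\cdots\le b_N$ and $b_1=0$. Assume that for a value $h\in(0,1/2)$ we have

    $$(1-h)\,b_{n+1}\le b_n+a_n$$

    for $n\in\mathbb{N}$. Then there exists a constant $Q>0$ independent of $h\in(0,1/2)$ and $N\in\mathbb{N}$ such that

    $$b_n\le e^{Q(nh)}\sum_{k=1}^n a_k$$

    for any $n\in\mathbb{N}$ with $n\le N/h$.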

    Proof. The proof can be obtained by induction.
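    The induction is not written out here. As a hedged numerical sanity check (not a proof), the sketch below iterates the extremal case of the recursion, reading the hypothesis as $(1-h)b_{n+1}\le b_n+a_n$ with $b_1=0$, and compares it with the bound $e^{Qnh}\sum_{k\le n}a_k$. The choice $Q=2\ln2$ is our assumption; it relies on the elementary inequality $1/(1-h)\le e^{2h\ln2}$ for $h\in(0,1/2)$. The function name is illustrative.

    ```python
    import math
    import random


    def gronwall_slack(h, a):
        """Smallest value of exp(Q n h) * sum_{k<=n} a_k - b_n along the extremal sequence."""
        q = 2.0 * math.log(2.0)          # assumed constant: 1/(1-h) <= exp(q h) on (0, 1/2)
        b, partial_sum, slack = 0.0, 0.0, float("inf")
        for n, a_n in enumerate(a, start=1):
            partial_sum += a_n                                  # sum_{k<=n} a_k
            slack = min(slack, math.exp(q * n * h) * partial_sum - b)
            b = (b + a_n) / (1.0 - h)                           # equality case: largest b_{n+1}
        return slack


    random.seed(0)
    samples = [random.random() for _ in range(100)]
    slack_small_h = gronwall_slack(0.01, samples)
    slack_large_h = gronwall_slack(0.49, samples)
    ```

    A non-negative slack for both step sizes is consistent with the stated bound; since the equality case dominates every admissible sequence, it is the natural one to test.
    
    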

    Now we obtain the Grönwall-type inequality.

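    Lemma B.4. Suppose that

    $$|B(x,\varphi)|\le C\sum_{n=1}^N\big(|(x(t),\varphi(t))_{I_n}|+|(u(t),\varphi(t))_{I_n}|\big) \tag{B.1}$$

    for all $\varphi\in X_h^r$. Then there exists a constant $C>0$ independent of $h>0$ such that

    $$\|x\|_{L^\infty(I)}\le C\|u\|_{L^2(I)}$$

    for all $u_1,u_2\in U_{ad}$ and $h>0$ small enough.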

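    Proof. From the condition (B.1) we have

    $$\Big|\sum_{n=1}^N(x'(t),\varphi(t))_{I_n}+\sum_{n=2}^N([x]_{n-1},\varphi^+_{n-1})+(x^+_0,\varphi^+_0)\Big|\le C\sum_{n=1}^N\big(|(x(t),\varphi(t))_{I_n}|+|(u(t),\varphi(t))_{I_n}|\big)$$

    for all $\varphi\in X_h^r$. To obtain the desired estimates, for each $n\in\{1,\dots,N\}$ we shall take the following test functions $\varphi\in X_h^r$ supported on $I_n$:

    $$\varphi(t)=x(t)1_{I_n}(t),\quad \varphi(t)=(t-t_{n-1})x'(t)1_{I_n}(t),\quad\text{and}\quad \varphi(t)=(t-t_{n-1})1_{I_n}(t),$$

    where $1_{I_n}:I\to\{0,1\}$ denotes the indicator function, that is, $1_{I_n}(t)=1$ for $t\in I_n$ and $1_{I_n}(t)=0$ for $t\in I\setminus I_n$. First we take $\varphi(t)=x(t)1_{I_n}(t)$ for $n=1,2,\dots,N$. Then,

    $$(x'(t),x(t))_{I_n}+([x]_{n-1},x^+_{n-1})\le C|(x(t),x(t))_{I_n}|+|(u(t),x(t))_{I_n}|, \tag{B.2}$$

    where for $n=1$ we abuse notation and let $[x]_0$ mean $x^+_0$. Notice that

    $$([x]_{n-1},x^+_{n-1})=(x^+_{n-1})^2-(x^-_{n-1},x^+_{n-1}),$$

    where for $n=1$ the above is understood as $([x]_0,x^+_0)=(x^+_0)^2$. Using this in (B.2), we find

    $$\frac12|x^-_n|^2-\frac12|x^+_{n-1}|^2+|x^+_{n-1}|^2\le(x^-_{n-1},x^+_{n-1})+C|(x(t),x(t))_{I_n}|+|(u(t),x(t))_{I_n}|.$$

    By applying the Cauchy-Schwarz inequality, we obtain

    $$\frac12|x^-_n|^2\le\frac12|x^-_{n-1}|^2+C\|x(t)\|_{L^2(I_n)}^2+C\|u(t)\|_{L^2(I_n)}^2. \tag{B.3}$$

    Secondly, we take $\varphi(t)=(t-t_{n-1})x'(t)1_{I_n}(t)$ to have

    $$(x'(t),(t-t_{n-1})x'(t))_{I_n}\le(x(t),(t-t_{n-1})x'(t))_{I_n}+(u(t),(t-t_{n-1})x'(t))_{I_n}.$$

    By using Hölder's inequality, we get

    $$\int_{I_n}(t-t_{n-1})|x'(t)|^2\,dt\le\int_{I_n}|t-t_{n-1}|\big(|x(t)|^2+|u(t)|^2\big)\,dt. \tag{B.4}$$

    Notice that

    $$(x'(t),(t-t_{n-1}))_{I_n}=-\int_{I_n}x(t)\,dt+x(t_n)(t_n-t_{n-1}).$$

    Thus, choosing $\varphi(t)=(t-t_{n-1})1_{I_n}(t)$ gives

    $$\Big|-\int_{I_n}x(t)\,dt+x(t_n)(t_n-t_{n-1})\Big|\le C\int_{I_n}|x(t)|(t-t_{n-1})\,dt+C\int_{I_n}|u(t)|(t-t_{n-1})\,dt,$$

    and subsequently, this yields

    $$\Big|\int_{I_n}x(t)\,dt\Big|^2\le2h_n^2|x^-_n|^2+2\int_{I_n}\big(x(t)^2+u(t)^2\big)\,dt\int_{I_n}(t_{n-1}-t)^2\,dt\le2h_n^2|x^-_n|^2+Ch_n^3\int_{I_n}\big(x(t)^2+u(t)^2\big)\,dt,$$

    where $h_n=t_n-t_{n-1}$. This together with Lemma B.1 asserts

    $$\Big|\int_{I_n}x(t)\,dt\Big|^2\le2h_n^2|x^-_n|^2+Ch_n^4\int_{I_n}(t-t_{n-1})|x'(t)|^2\,dt+Ch_n^3\int_{I_n}|u(t)|^2\,dt \tag{B.5}$$

    for $h>0$ small enough. Combining (B.3) and (B.4), we find

    $$\begin{aligned}\int_{I_n}(t-t_{n-1})|x'(t)|^2\,dt+|x^-_n|^2&\le C\|x\|_{L^2(I_n)}^2+C\int_{I_n}|u(t)|^2\,dt+|x^-_{n-1}|^2\\&\le\frac{C}{h_n}\Big|\int_{I_n}x(t)\,dt\Big|^2+Ch_n\int_{I_n}(t-t_{n-1})|x'(t)|^2\,dt+|x^-_{n-1}|^2+C\int_{I_n}|u(t)|^2\,dt,\end{aligned}$$

    where we applied Lemma B.1 in the second inequality. From this, together with (B.5), we obtain

    $$\frac12\int_{I_n}(t-t_{n-1})|x'(t)|^2\,dt+|x^-_n|^2\le Ch_n|x^-_n|^2+|x^-_{n-1}|^2+C\int_{I_n}|u(t)|^2\,dt \tag{B.6}$$

    for $h>0$ small enough, where for $n=1$ one has $|x^-_0|=0$. This inequality trivially gives

    $$|x^-_n|^2\le Ch_n|x^-_n|^2+|x^-_{n-1}|^2+C\int_{I_n}|u(t)|^2\,dt$$

    for $n=1,\dots,N$. Now, by applying Lemma B.3 to find an estimate of $|x^-_n|^2$ and inserting it into (B.6), we achieve

    $$\frac12\int_{I_n}(t-t_{n-1})|x'(t)|^2\,dt+|x^-_n|^2\le C\int_0^T|u(t)|^2\,dt.$$

    Finally, by applying Lemma B.2 to the above, we obtain the desired estimate.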

    As a corollary, we have the following Lipschitz estimates.

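    Lemma B.5. For $u,v\in U_{ad}$ we have

    $$\|G_h(u)-G_h(v)\|_{L^\infty(I)}\le C\|u-v\|_{L^2(I)}$$

    and

    $$\|\lambda_h(u)-\lambda_h(v)\|_{L^\infty(I)}\le C\|u-v\|_{L^2(I)}.$$

    Proof. Let us write $\hat u:=v$ and denote $x=G_h(u)$ and $\hat x=G_h(\hat u)$. Then it follows from (2.3) that

    $$B((x-\hat x),\varphi)=\big(f(t,x(t),u(t))-f(t,\hat x(t),\hat u(t)),\varphi\big),\quad\forall\,\varphi\in X_h^r.$$

    By (1.3), there exists a constant $C>0$ such that

    $$|f(t,x(t),u(t))-f(t,\hat x(t),\hat u(t))|\le C|\hat x(t)-x(t)|+C|\hat u(t)-u(t)|.$$

    By applying Lemma B.4, we get the inequality

    $$\|x-\hat x\|_{L^\infty(I)}\le C\|u-\hat u\|_{L^2(I)}.$$

    This gives the first inequality. For the second one, we denote $\lambda=\lambda_h(u)$ and $\hat\lambda=\lambda_h(\hat u)$. Then, we see from Lemma 3.8 that

    $$B(\varphi,(\lambda-\hat\lambda))=\Big(\varphi,\ \partial_xf(\cdot,x,u)(\lambda-\hat\lambda)(t)+\big(\partial_xf(\cdot,x,u)-\partial_xf(\cdot,\hat x,\hat u)\big)\hat\lambda(t)-\big(\partial_xg(\cdot,x,u)-\partial_xg(\cdot,\hat x,\hat u)\big)\Big)_I,\quad\forall\,\varphi\in X_h^r.$$

    By applying Lemma B.4 again in a backward way (see Lemma 3.7), we obtain

    $$\begin{aligned}\|\lambda-\hat\lambda\|_{L^\infty(I)}&\le C\big\|\big(\partial_xf(\cdot,x,u)-\partial_xf(\cdot,\hat x,\hat u)\big)\hat\lambda\big\|_{L^2(I)}+C\|\partial_xg(\cdot,x,u)-\partial_xg(\cdot,\hat x,\hat u)\|_{L^2(I)}\\&\le C\big(\|\hat\lambda\|_{L^\infty(I)}+1\big)\big(\|x-\hat x\|_{L^\infty(I)}+\|u-\hat u\|_{L^2(I)}\big)\le C\|u-\hat u\|_{L^2(I)},\end{aligned}$$

    where we used

    $$\|\hat\lambda\|_{L^\infty(I)}\le C\|\partial_xg\|_{L^\infty(I)},$$

    due to Lemma B.4. This completes the proof.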

    This section is devoted to proving that the discrete control-to-state mapping $G_h$ is twice differentiable. We also obtain the first and second derivatives of $G_h$.

    Theorem C.1. We denote $x_h^s=G_h(u+sv)$ and let $y_h\in X_h^r$ be the solution of the following discretized equation:

    $$B(y_h,\varphi)=\Big(\frac{\partial f}{\partial x}(t,x_h,u)\,y_h(t)+\frac{\partial f}{\partial u}(t,x_h,u)\,v(t),\ \varphi(t)\Big)_I,\quad\forall\,\varphi\in X_h^r, \tag{C.1}$$

    where $x_h=G_h(u)$. Then we have $\frac{d}{ds}x_h^s(t)\big|_{s=0}=y_h(t)$.

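    Proof. By Theorem 2.1 there exists a solution $y_h\in X_h^r$ to

    $$B(y_h,\varphi)=\Big(\frac{\partial f}{\partial x}(t,x_h,u)\,y_h(t)+\frac{\partial f}{\partial u}(t,x_h,u)\,v(t),\ \varphi(t)\Big)_I,\quad\forall\,\varphi\in X_h^r.$$

    By Lemma B.4 we get

    $$\|y_h\|_{L^\infty(I)}\le C\|v\|_{L^2(I)}. \tag{C.2}$$

    Recall that $x_h^s$ and $x_h$ satisfy

    $$B(x_h^s,\varphi)=\big(f(t,x_h^s,u+sv),\varphi(t)\big)_I\quad\text{and}\quad B(x_h,\varphi)=\big(f(t,x_h,u),\varphi(t)\big)_I.$$

    Using this, we find that $r(t):=x_h^s(t)-x_h(t)-sy_h(t)$ satisfies

    $$\begin{aligned}B((x_h^s-x_h-sy_h),\varphi)&=\Big(f(t,x_h^s,u+sv)-f(t,x_h,u)-s\Big(\frac{\partial f}{\partial x}(t,x_h,u)\,y_h(t)+\frac{\partial f}{\partial u}(t,x_h,u)\,v(t)\Big),\ \varphi(t)\Big)\\&=\Big(\frac{\partial f}{\partial x}(t,x_h,u)\big(x_h^s(t)-x_h(t)-sy_h(t)\big)+A_1+A_2,\ \varphi(t)\Big)\end{aligned} \tag{C.3}$$

    for all $\varphi\in X_h^r$, where

    $$A_1=f(t,x_h^s,u)-f(t,x_h,u)-\frac{\partial f}{\partial x}(t,x_h,u)\big(x_h^s(t)-x_h(t)\big)$$

    and

    $$A_2=f(t,x_h^s,u+sv)-f(t,x_h^s,u)-s\frac{\partial f}{\partial u}(t,x_h,u)\,v(t).$$

    Given that $|x_h^s(t)-x_h(t)|\le Cs$ and (1.3), an elementary calculation shows that $|A_1|\le Cs^2$ and $|A_2|\le Cs^2$. With these bounds, we may apply Lemma B.4 to deduce $|r(t)|\le Cs^2$ for $t\in[0,T]$. From this we find that

    $$\lim_{s\to0}\frac{x_h^s(t)-x_h(t)-sy_h(t)}{s}=0,$$

    which yields that

    $$\frac{d}{ds}x_h^s(t)\Big|_{s=0}=y_h(t).$$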

    This completes the proof.

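    Lemma C.2. The following holds:

    $$\|G_h'(u_1)v-G_h'(u_2)v\|_{L^\infty(I)}\le C\|u_1-u_2\|_{L^2(I)}\|v\|_{L^\infty(I)}.$$

    Proof. Let $y_h=G_h'(u_1)v\in X_h^r$ and $z_h=G_h'(u_2)v\in X_h^r$. Then we obtain

    $$B(y_h,\varphi)=\Big(\frac{\partial f}{\partial x}(t,G_h(u_1),u_1)\,y_h(t)+\frac{\partial f}{\partial u}(t,G_h(u_1),u_1)\,v(t),\ \varphi(t)\Big)_I$$

    and

    $$B(z_h,\varphi)=\Big(\frac{\partial f}{\partial x}(t,G_h(u_2),u_2)\,z_h(t)+\frac{\partial f}{\partial u}(t,G_h(u_2),u_2)\,v(t),\ \varphi(t)\Big)_I$$

    for all $\varphi\in X_h^r$. Combining these equalities, we have

    $$\begin{aligned}B(y_h-z_h,\varphi)&=\Big(\frac{\partial f}{\partial x}(t,G_h(u_1),u_1)(y_h-z_h)(t),\ \varphi(t)\Big)_I+\Big(\Big(\frac{\partial f}{\partial x}(t,G_h(u_1),u_1)-\frac{\partial f}{\partial x}(t,G_h(u_2),u_2)\Big)z_h(t),\ \varphi(t)\Big)_I\\&\quad+\Big(\Big(\frac{\partial f}{\partial u}(t,G_h(u_1),u_1)-\frac{\partial f}{\partial u}(t,G_h(u_2),u_2)\Big)v(t),\ \varphi(t)\Big)_I\end{aligned} \tag{C.4}$$

    for all $\varphi\in X_h^r$. On the other hand, the following two inequalities hold:

    $$\Big|\Big(\frac{\partial f}{\partial x}(t,G_h(u_1),u_1)-\frac{\partial f}{\partial x}(t,G_h(u_2),u_2)\Big)z_h(t)\Big|\le C\big(|u_1-u_2|+|G_h(u_1)-G_h(u_2)|\big)|z_h(t)|$$

    and

    $$\Big|\Big(\frac{\partial f}{\partial u}(t,G_h(u_1),u_1)-\frac{\partial f}{\partial u}(t,G_h(u_2),u_2)\Big)v(t)\Big|\le C\big(|u_1-u_2|+|G_h(u_1)-G_h(u_2)|\big)|v(t)|.$$

    Given these estimates, by applying Lemma B.4 to (C.4), we obtain

    $$\begin{aligned}\|y_h-z_h\|_{L^\infty(I)}&\le C\big\|\big(|u_1-u_2|+|G_h(u_1)-G_h(u_2)|\big)|z_h(t)|\big\|_{L^2(I)}+C\big\|\big(|u_1-u_2|+|G_h(u_1)-G_h(u_2)|\big)|v(t)|\big\|_{L^2(I)}\\&\le C\|u_1-u_2\|_{L^2(I)}\|v\|_{L^\infty(I)},\end{aligned}$$

    where we used Lemma B.5 in the second inequality.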

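    Lemma C.3. Let $z_h\in X_h^r$ be the solution of the following discretized equation:

    $$B(z_h,\varphi)=\int_0^T\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x_h,u)\,y_h^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x_h,u)\,y_h(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x_h,u)\,v^2(t)\Big)\varphi(t)\,dt+\int_0^T\frac{\partial f}{\partial x}(t,x_h,u)\,z_h(t)\varphi(t)\,dt$$

    for any $\varphi\in X_h^r$, where $y_h\in X_h^r$ is the solution of (C.1). Then we have

    $$\frac{d^2}{(ds)^2}G_h(u+sv)\Big|_{s=0}=z_h(t).$$

    Proof. Let

    $$y_h^s(t)=\frac{d}{ds}G_h(u+sv)\quad\text{and}\quad y_h(t)=\frac{d}{ds}G_h(u+sv)\Big|_{s=0}.$$

    It then follows that

    $$B\big(y_h^s-y_h-sz_h,\varphi\big)=:\Big(\frac{\partial f}{\partial x}(t,x_h(t),u)\big(y_h^s(t)-y_h(t)-sz_h(t)\big)+A_1(t)+A_2(t),\ \varphi(t)\Big), \tag{C.5}$$

    where

    $$A_1(t):=\Big[\frac{\partial f}{\partial x}(t,x_h^s,u+sv)-\frac{\partial f}{\partial x}(t,x_h,u)\Big]y_h^s(t)-s\Big[\frac{\partial^2 f}{(\partial x)^2}(t,x_h,u)\,y_h(t)+\frac{\partial^2 f}{\partial x\partial u}(t,x_h,u)\,v(t)\Big]y_h(t)$$

    and

    $$A_2(t):=\Big[\frac{\partial f}{\partial u}(t,x_h^s,u+sv)-\frac{\partial f}{\partial u}(t,x_h,u)\Big]v(t)-s\Big[\frac{\partial^2 f}{(\partial u)^2}(t,x_h,u)\,v(t)+\frac{\partial^2 f}{\partial x\partial u}(t,x_h,u)\,y_h(t)\Big]v(t).$$

    We obtain from Lemma C.2 the estimate $|y_h^s(t)-y_h(t)|\le Cs$. Given this estimate and the fact that $\frac{d}{ds}x_h^s(t)\big|_{s=0}=y_h(t)$ from Theorem C.1, an elementary calculation reveals that $|A_1(t)|\le Cs^2$ and $|A_2(t)|\le Cs^2$. Putting these estimates into (C.5) and using Lemma B.4, we find

    $$y_h^s(t)-y_h(t)-sz_h(t)=O(s^2).$$

    This yields that

    $$\frac{d}{ds}y_h^s(t)\Big|_{s=0}=z_h(t),$$

    and so we have

    $$\frac{d^2}{(ds)^2}G_h(u+sv)\Big|_{s=0}=z_h(t)$$

    since

    $$y_h^s(t)=\frac{d}{ds}G_h(u+sv).$$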

    The proof is done.

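    In this part, we give the proofs of Lemmas 3.3 and 3.5. Before presenting them, we explain how to derive the discrete adjoint equation (3.8) from the Lagrangian associated to (2.5).

    Let us first write the Lagrangian of the problems (1.1) and (3.7) as follows:

    $$L_h(x_h,u,\lambda_h):=\int_0^Tg(t,x_h(t),u(t))\,dt+B(x_h,\lambda_h)-(f(\cdot,x_h,u),\lambda_h)_I-(x_0,\lambda^+_{h,0}) \tag{D.1}$$

    for $\lambda_h\in X_h^r$, where the bilinear operator $B(\cdot,\cdot)$ is given by (3.7). If we compute the functional derivative of the above Lagrangian (D.1) with respect to the adjoint state $\lambda_h$, then $\delta L_h/\delta\lambda_h=0$ leads to (3.7). We now derive the equation of the discrete adjoint state. Using integration by parts, we find

    $$B(x_h,\lambda_h)=-\sum_{n=1}^N(x_h,\lambda_h')_{I_n}-\sum_{n=1}^{N-1}(x^-_{h,n},[\lambda_h]_n)+(x^-_{h,N},\lambda^-_{h,N}).$$

    This enables us to rewrite the Lagrangian (D.1) as

    $$L_h(x_h,u,\lambda_h)=\int_0^Tg(t,x_h(t),u(t))\,dt-\sum_{n=1}^N(x_h,\lambda_h')_{I_n}-(f(\cdot,x_h,u),\lambda_h)_I-\sum_{n=1}^{N-1}(x^-_{h,n},[\lambda_h]_n)+(x^-_{h,N},\lambda^-_{h,N})-(x_0,\lambda^+_{h,0}),$$

    and this further implies

    $$\begin{aligned}0=\frac{\delta L_h(x_h,u,\lambda_h)}{\delta x_h}(\psi_h)&=\int_0^T\frac{\partial g}{\partial x}(t,x_h(t),u(t))\,\psi_h(t)\,dt-\sum_{n=1}^N(\psi_h,\lambda_h')_{I_n}-\Big(\frac{\partial f}{\partial x}(\cdot,x_h,u)\psi_h,\lambda_h\Big)_I-\sum_{n=1}^{N-1}(\psi^-_{h,n},[\lambda_h]_n)+(\psi^-_{h,N},\lambda^-_{h,N})\\&=\int_0^T\frac{\partial g}{\partial x}(t,x_h(t),u(t))\,\psi_h(t)\,dt-\Big(\frac{\partial f}{\partial x}(\cdot,x_h,u)\psi_h,\lambda_h\Big)_I+B(\psi_h,\lambda_h)\end{aligned} \tag{D.2}$$

    for all $\psi_h\in X_h^r$, where we applied integration by parts to $(\psi_h,\lambda_h')_{I_n}$ to derive the second equality. The above equality corresponds to the adjoint equation (3.8).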

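    Proof of Lemma 3.3. In order to compute the functional derivative of $j$ with respect to $u$, we consider $j(u+sv)=J(u+sv,G(u+sv))$ with $v\in U$ and $s\in\mathbb{R}_+$. If we set $x^s(t):=G(u+sv)(t)$, it follows from Lemma A.1 that $y=\frac{d}{ds}x^s(t)\big|_{s=0}$ satisfies

    $$y'(t)=\frac{\partial f}{\partial x}(t,x,u)\,y(t)+\frac{\partial f}{\partial u}(t,x,u)\,v(t), \tag{D.3}$$

    with the initial condition $y(0)=0$. Recall from (3.4) that the adjoint state $\lambda(t)=\lambda(u)(t)$ satisfies

    $$\lambda'(t)=\frac{\partial g}{\partial x}(t,x,u)-\lambda(t)\frac{\partial f}{\partial x}(t,x,u). \tag{D.4}$$

    Since $x^s(t)$ is differentiable with respect to $s$, the cost $j(u+sv)$ is differentiable with respect to $s$, and its derivative is computed as

    $$\begin{aligned}j'(u)v=\frac{d}{ds}j(u+sv)\Big|_{s=0}&=\int_0^T\frac{\partial g}{\partial u}(t,x(t),u(t))\,v(t)\,dt+\int_0^T\frac{\partial g}{\partial x}(t,x(t),u(t))\,y(t)\,dt\\&=\int_0^T\Big(\frac{\partial g}{\partial u}(t,x(t),u(t))-\lambda(t)\frac{\partial f}{\partial u}(t,x(t),u(t))\Big)v(t)\,dt,\end{aligned}$$

    where we used

    $$\int_0^T\frac{\partial g}{\partial x}(t,x(t),u(t))\,y(t)\,dt=\int_0^T\Big(\lambda'(t)+\lambda(t)\frac{\partial f}{\partial x}(t,x(t),u(t))\Big)y(t)\,dt=-\int_0^T\lambda(t)\frac{\partial f}{\partial u}(t,x(t),u(t))\,v(t)\,dt,$$

    due to (D.3), (D.4), $y(0)=0$, and $\lambda(T)=0$.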

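    Proof of Lemma 3.5. The proof is very similar to that of Lemma 3.3. We consider $j_h(u+sv)=J(u+sv,G_h(u+sv))$ with $v\in U$ and $s\in\mathbb{R}_+$. We recall from Theorem C.1 that the function $x_h^s:=G_h(u+sv)$ is differentiable at $s=0$ with

    $$\frac{d}{ds}x_h^s\Big|_{s=0}=y_h,$$

    where $y_h\in X_h^r$ satisfies the following equation:

    $$B(y_h,\varphi)=\Big(\frac{\partial f}{\partial x}(\cdot,x_h,u)\,y_h+\frac{\partial f}{\partial u}(\cdot,x_h,u)\,v,\ \varphi\Big)_I,\quad\forall\,\varphi\in X_h^r. \tag{D.5}$$

    Using this, we obtain

    $$j_h'(u)v=\frac{d}{ds}j_h(u+sv)\Big|_{s=0}=\int_0^T\frac{\partial g}{\partial u}(t,x_h(t),u(t))\,v(t)\,dt+\int_0^T\frac{\partial g}{\partial x}(t,x_h(t),u(t))\,y_h(t)\,dt. \tag{D.6}$$

    We then take $\psi_h=y_h$ in (D.2) to get

    $$\int_0^T\frac{\partial g}{\partial x}(t,x_h(t),u(t))\,y_h(t)\,dt=\sum_{n=1}^N(y_h,\lambda_h')_{I_n}+\Big(\frac{\partial f}{\partial x}(\cdot,x_h,u)\,y_h,\lambda_h\Big)_I+\sum_{n=1}^{N-1}(y^-_{h,n},[\lambda_h]_n)-(y^-_{h,N},\lambda^-_{h,N}).$$

    On the other hand, by using integration by parts, we find

    $$\sum_{n=1}^N(y_h,\lambda_h')_{I_n}+\sum_{n=1}^{N-1}(y^-_{h,n},[\lambda_h]_n)-(y^-_{h,N},\lambda^-_{h,N})=-\sum_{n=1}^N(y_h',\lambda_h)_{I_n}-\sum_{n=2}^N([y_h]_{n-1},\lambda^+_{h,n-1})-(y^+_{h,0},\lambda^+_{h,0})=-B(y_h,\lambda_h),$$

    where $B(\cdot,\cdot)$ is defined in (3.6). This yields

    $$\int_0^T\frac{\partial g}{\partial x}(t,x_h(t),u(t))\,y_h(t)\,dt=-B(y_h,\lambda_h)+\Big(\frac{\partial f}{\partial x}(\cdot,x_h,u)\,y_h,\lambda_h\Big)_I=-\Big(\frac{\partial f}{\partial u}(\cdot,x_h,u)\,v,\lambda_h\Big)_I,$$

    due to (D.5). This together with (D.6) gives

    $$j_h'(u)v=\int_0^T\Big(\frac{\partial g}{\partial u}(t,x_h(t),u(t))-\frac{\partial f}{\partial u}(t,x_h(t),u(t))\,\lambda_h(t)\Big)v(t)\,dt,$$

    where $v\in U$.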

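    In this appendix, we provide details of the derivation of the second-order derivative of the cost functional $j$ and its discrete version $j_h$.

    Lemma E.1. Let $j$ be the cost functional for the optimal control problems (1.1) and (1.2). Then, for $u\in U_{ad}$ and $v\in U$, we have

    $$\begin{aligned}j''(u)(v,v)&=-\int_0^T\lambda(t)\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x(t),u(t))\,y(t)v(t)\Big)dt-\int_0^T\lambda(t)\frac{\partial^2 f}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)\,dt\\&\quad+\int_0^T\frac{\partial^2 g}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)\,dt+\int_0^T2\frac{\partial^2 g}{\partial x\partial u}(t,x,u)\,y(t)v(t)\,dt+\int_0^T\frac{\partial^2 g}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)\,dt.\end{aligned}$$

    Proof. As in Appendix D, we consider $j(u+sv)=J(u+sv,G(u+sv))$ with $v\in U$ and $s\in\mathbb{R}_+$ and set $x^s(t):=G(u+sv)(t)$. By Lemmas A.1 and A.2, it follows that

    $$\frac{d}{ds}x^s\Big|_{s=0}=y\quad\text{and}\quad\frac{d^2}{(ds)^2}x^s\Big|_{s=0}=z,$$

    where $y\in X$ is given as in (D.3) and $z\in X$ is the solution to

    $$z'(t)=\frac{\partial^2 f}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x(t),u(t))\,y(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)+\frac{\partial f}{\partial x}(t,x(t),u(t))\,z(t),$$

    with the initial condition $z(0)=0$. Then we obtain

    $$\begin{aligned}j''(u)(v,v)=\frac{d^2}{ds^2}j(u+sv)\Big|_{s=0}&=\frac{d^2}{ds^2}\int_0^Tg(t,x^s(t),u(t)+sv(t))\,dt\Big|_{s=0}\\&=\int_0^T\frac{\partial g}{\partial x}(t,x(t),u(t))\,z(t)\,dt+\int_0^T\frac{\partial^2 g}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)\,dt\\&\quad+\int_0^T2\frac{\partial^2 g}{\partial x\partial u}(t,x,u)\,y(t)v(t)\,dt+\int_0^T\frac{\partial^2 g}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)\,dt.\end{aligned} \tag{E.1}$$

    On the other hand, we use (D.4) to get

    $$\begin{aligned}\int_0^T\frac{\partial g}{\partial x}(t,x(t),u(t))\,z(t)\,dt&=\int_0^T\lambda'(t)z(t)\,dt+\int_0^T\frac{\partial f}{\partial x}(t,x(t),u(t))\,\lambda(t)z(t)\,dt\\&=-\int_0^T\lambda(t)z'(t)\,dt+\int_0^T\frac{\partial f}{\partial x}(t,x(t),u(t))\,\lambda(t)z(t)\,dt\\&=-\int_0^T\lambda(t)\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x(t),u(t))\,y(t)v(t)\Big)dt-\int_0^T\lambda(t)\frac{\partial^2 f}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)\,dt,\end{aligned}$$

    where we used $\lambda(T)=0$ and $z(0)=0$. By combining the above with (E.1), we have

    $$\begin{aligned}j''(u)(v,v)&=-\int_0^T\lambda(t)\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x(t),u(t))\,y(t)v(t)\Big)dt-\int_0^T\lambda(t)\frac{\partial^2 f}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)\,dt\\&\quad+\int_0^T\frac{\partial^2 g}{(\partial x)^2}(t,x(t),u(t))\,y^2(t)\,dt+\int_0^T2\frac{\partial^2 g}{\partial x\partial u}(t,x,u)\,y(t)v(t)\,dt+\int_0^T\frac{\partial^2 g}{(\partial u)^2}(t,x(t),u(t))\,v^2(t)\,dt.\end{aligned} \tag{E.2}$$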

    This completes the proof.

    Remark E.2. Solving the differential equation (A.1) gives

    $$y(t)=\int_0^t\frac{\partial f}{\partial u}(s,x(s),u(s))\,v(s)\exp\Big(\int_s^t\frac{\partial f}{\partial x}(\tau,x(\tau),u(\tau))\,d\tau\Big)ds,$$

    and thus

    $$|y(t)|\le C\int_0^t|v(s)|\,ds,$$

    where $C>0$ depends only on $\|f\|_{L^\infty(0,T;W^{1,\infty})}$ and $T>0$. This estimate for $y$ enables us to bound the first four integrals on the right-hand side of (E.2) by

    $$C\|v\|_{L^2(I)}^2,$$

    where $C>0$ depends only on $\|\lambda\|_{L^\infty(I)}$, $T$, $\|f\|_{L^\infty(0,T;W^{2,\infty})}$, and $\|g\|_{L^\infty(0,T;W^{2,\infty})}$. This implies that if $g$ is given by

    $$g(t,x,u)=\tilde g(t,x)+\gamma|u|^2,$$

    then we have

    $$j''(u)(v,v)\ge(2\gamma-C)\|v\|_{L^2(I)}^2,$$

    which satisfies (2.4) if $\gamma>C/2$. It would be interesting to develop a numerical method to check (2.4) in the general case.

    Next we carry out a similar calculation for the approximate solution.

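    Lemma E.3. Let $j_h$ be the discrete cost functional for the optimal control problems (1.1) and (1.2). Then, for $u\in U_{ad}$ and $v\in U$, we have

    $$\begin{aligned}j_h''(u)(v,v)&=-\int_0^T\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x_h,u)\,y_h^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x_h,u)\,y_h(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x_h,u)\,v^2(t)\Big)\lambda_h(t)\,dt\\&\quad+\int_0^T\Big(\frac{\partial^2 g}{(\partial x)^2}(t,x_h,u)\,y_h^2(t)+2\frac{\partial^2 g}{\partial x\partial u}(t,x_h,u)\,y_h(t)v(t)+\frac{\partial^2 g}{(\partial u)^2}(t,x_h,u)\,v^2(t)\Big)dt.\end{aligned}$$

    Proof. As in the proof of Lemma 3.5, we consider $j_h(u+sv)=J(u+sv,G_h(u+sv))$ with $v\in U$ and $s\in\mathbb{R}_+$ and set $x_h^s:=G_h(u+sv)$. We recall from Theorem C.1 and Lemma C.3 that

    $$\frac{d}{ds}x_h^s\Big|_{s=0}=y_h\quad\text{and}\quad\frac{d^2}{(ds)^2}x_h^s\Big|_{s=0}=z_h,$$

    where $z_h\in X_h^r$ satisfies

    $$B(z_h,\varphi)=\int_0^T\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x_h,u)\,y_h^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x_h,u)\,y_h(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x_h,u)\,v^2(t)\Big)\varphi(t)\,dt+\int_0^T\frac{\partial f}{\partial x}(t,x_h,u)\,z_h(t)\varphi(t)\,dt.$$

    Now a straightforward computation gives

    $$\begin{aligned}j_h''(u)(v,v)&=\frac{d^2}{ds^2}\int_0^Tg(t,x_h^s(t),u(t)+sv(t))\,dt\Big|_{s=0}\\&=\int_0^T\frac{\partial g}{\partial x}(t,x_h(t),u(t))\,z_h(t)\,dt+\int_0^T\frac{\partial^2 g}{(\partial x)^2}(t,x_h(t),u(t))\,y_h^2(t)\,dt\\&\quad+\int_0^T2\frac{\partial^2 g}{\partial x\partial u}(t,x_h(t),u(t))\,y_h(t)v(t)\,dt+\int_0^T\frac{\partial^2 g}{(\partial u)^2}(t,x_h(t),u(t))\,v^2(t)\,dt.\end{aligned}$$

    Note that the discrete adjoint state $\lambda_h(t)=\lambda_h(u)(t)$ satisfies

    $$-B(\psi,\lambda_h)+\Big(\frac{\partial f}{\partial x}(t,x_h,u)\,\lambda_h,\psi\Big)_I=\Big(\frac{\partial g}{\partial x}(t,x_h,u),\psi\Big)_I$$

    for all $\psi\in X_h^r$. Thus by considering $\psi=z_h\in X_h^r$, we find

    $$\begin{aligned}\Big(\frac{\partial g}{\partial x}(t,x_h,u),z_h\Big)_I&=-B(z_h,\lambda_h)+\Big(\frac{\partial f}{\partial x}(t,x_h,u)\,\lambda_h,z_h\Big)_I\\&=-\int_0^T\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x_h,u)\,y_h^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x_h,u)\,y_h(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x_h,u)\,v^2(t)\Big)\lambda_h(t)\,dt.\end{aligned}$$

    Combining the above equalities, we have

    $$\begin{aligned}j_h''(u)(v,v)&=-\int_0^T\Big(\frac{\partial^2 f}{(\partial x)^2}(t,x_h,u)\,y_h^2(t)+2\frac{\partial^2 f}{\partial x\partial u}(t,x_h,u)\,y_h(t)v(t)+\frac{\partial^2 f}{(\partial u)^2}(t,x_h,u)\,v^2(t)\Big)\lambda_h(t)\,dt\\&\quad+\int_0^T\Big(\frac{\partial^2 g}{(\partial x)^2}(t,x_h,u)\,y_h^2(t)+2\frac{\partial^2 g}{\partial x\partial u}(t,x_h,u)\,y_h(t)v(t)+\frac{\partial^2 g}{(\partial u)^2}(t,x_h,u)\,v^2(t)\Big)dt.\end{aligned}$$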

    This completes the proof.



    [1] N. Arada, E. Casas, F. Tröltzsch, Error estimates for the numerical approximation of a semilinear elliptic control problem, Comput. Optim. Appl., 23 (2002), 201–229. https://doi.org/10.1023/A:1020576801966 doi: 10.1023/A:1020576801966
    [2] W. Alt, On the approximation of infinite optimization problems with an application to optimal control problems, Appl. Math. Optim., 12 (1984), 15–27. https://doi.org/10.1007/BF01449031 doi: 10.1007/BF01449031
    [3] W. Alt, U. Felgenhauer, M. Seydenschwanz, Euler discretization for a class of nonlinear optimal control problems with control appearing linearly, Comput. Optim. Appl., 69 (2018), 825–856. https://doi.org/10.1007/s10589-017-9969-7 doi: 10.1007/s10589-017-9969-7
    [4] J. Bonnans, N. Osmolovskiĭ, Second-order analysis of optimal control problems with control and initial-final state constraints, J. Convex Anal., 17 (2010), 885–913.
    [5] J. Bonnans, X. Dupuis, L. Pffiffer, Second-order sufficient conditions for strong solutions to optimal control problems, ESAIM: COCV., 20 (2014), 704–724. https://doi.org/10.1051/cocv/2013080 doi: 10.1051/cocv/2013080
    [6] L. Bonifacius, K. Pieper, Konstantin, B. Vexler, A priori error estimates for space-time finite element discretization of parabolic time-optimal control problems, SIAM J. Control Optim., 57 (2019), 129–162. https://doi.org/10.1137/18M1166948 doi: 10.1137/18M1166948
    [7] M. Baccouch, Analysis of a posteriori error estimates of the discontinuous Galerkin method for nonlinear ordinary differential equations, Appl. Numer. Math., 106 (2016), 129–153. https://doi.org/10.1016/j.apnum.2016.03.008 doi: 10.1016/j.apnum.2016.03.008
    [8] T. Bayen, F. Silva, Second order analysis for strong solutions in the optimal control of parabolic equations, SIAM J. Control Optim., 54 (2016), 819–844. https://doi.org/10.1137/141000415 doi: 10.1137/141000415
    [9] C. Christof, B. Vexler, New regularity results and finite element error estimates for a class of parabolic optimal control problems with pointwise state constraints, ESAIM: COCV., 27 (2021), 4. https://doi.org/10.1051/cocv/2020059 doi: 10.1051/cocv/2020059
    [10] E. Casas, F. Tröltzsch, Second-order optimality conditions for weak and strong local solutions of parabolic optimal control problems, Vietnam J. Math., 44 (2016), 181–202. https://doi.org/10.1007/s10013-015-0175-6 doi: 10.1007/s10013-015-0175-6
    [11] E. Casas, F. Tröltzsch, Second order analysis for optimal control problems: improving results expected from abstract theory, SIAM J. Optim., 22 (2012), 261–279. https://doi.org/10.1137/110840406 doi: 10.1137/110840406
    [12] K. Chrysafinos, Convergence of discontinuous Galerkin approximations of an optimal control problem associated to semilinear parabolic PDE's, ESAIM Math. Model. Num., 44 (2010), 189–206. https://doi.org/10.1051/m2an/2009046 doi: 10.1051/m2an/2009046
    [13] A. L. Dontchev, W. W. Hager, Lipschitzian stability in nonlinear control and optimization, SIAM J. Control Optim., 31 (1993), 569–603. https://doi.org/10.1137/0331026 doi: 10.1137/0331026
    [14] A. L. Dontchev, W. W. Hager, The Euler approximation in state constrained optimal control, Math. Comp., 70 (2000), 173–203. https://doi.org/10.1090/S0025-5718-00-01184-4 doi: 10.1090/S0025-5718-00-01184-4
    [15] A. L. Dontchev, W. W. Hager, V. M. Veliov, Second-order Runge-Kutta approximations in control constrained optimal control, SIAM J. Numer. Anal., 38 (2000), 202–226. https://doi.org/10.1137/S0036142999351765 doi: 10.1137/S0036142999351765
    [16] A. L. Dontchev, M. I. Krastanov, I. V. Kolmanovsky, M. M. Nicotra, V. M. Veliov, Lipschitz Stability in Discretized Optimal Control with Application to SQP, SIAM J. Control Optim., 57 (2019), 468–489. https://doi.org/10.1137/18M1188483 doi: 10.1137/18M1188483
    [17] A. L. Dontchev, I. V. Kolmanovsky, M. I. Krastanov, V. M. Veliov, P. T. Vuong, Approximating optimal finite horizon feedback by model predictive control, Syst. Control Lett., 139 (2020), 104666. https://doi.org/10.1016/j.sysconle.2020.104666 doi: 10.1016/j.sysconle.2020.104666
    [18] M. Delfour, W. W. Hager, F. Trochu, Discontinuous Galerkin methods for ordinary differential equations, Math. Comp., 36 (1981), 455–473. https://doi.org/10.1090/S0025-5718-1981-0606506-0
    [19] D. Estep, A posteriori error bounds and global error control for approximation of ordinary differential equations, SIAM J. Numer. Anal., 32 (1995), 1–48. https://doi.org/10.1137/0732001
    [20] G. Elnagar, M. A. Kazemi, M. Razzaghi, The pseudospectral Legendre method for discretizing optimal control problems, IEEE T. Automat. Contr., 40 (1995), 1793–1796. https://doi.org/10.1109/9.467672
    [21] U. Felgenhauer, On stability of bang-bang type controls, SIAM J. Control Optim., 41 (2003), 1843–1867. https://doi.org/10.1137/S0363012901399271
    [22] C. Glusa, E. Otárola, Error estimates for the optimal control of a parabolic fractional PDE, SIAM J. Numer. Anal., 59 (2021), 1140–1165. https://doi.org/10.1137/19M1267581
    [23] D. Hafemeyer, F. Mannel, I. Neitzel, B. Vexler, Finite element error estimates for one-dimensional elliptic optimal control by BV-functions, Math. Control Relat. F., 10 (2020), 333–363. https://doi.org/10.3934/mcrf.2019041
    [24] J. Henriques, J. Lemos, J. Eça, L. Gato, A. Falcão, A high-order discontinuous Galerkin method with mesh refinement for optimal control, Automatica, 85 (2017), 70–82. https://doi.org/10.1016/j.automatica.2017.07.029
    [25] D. Meidner, B. Vexler, A priori error estimates for space-time finite element discretization of parabolic optimal control problems. Part I: Problems without control constraints, SIAM J. Control Optim., 47 (2008), 1150–1177. https://doi.org/10.1137/070694016
    [26] D. Meidner, B. Vexler, A priori error estimates for space-time finite element discretization of parabolic optimal control problems. Part II: Problems with control constraints, SIAM J. Control Optim., 47 (2008), 1301–1329. https://doi.org/10.1137/070694028
    [27] D. Meidner, B. Vexler, Optimal error estimates for fully discrete Galerkin approximations of semilinear parabolic equations, ESAIM Math. Model. Num., 52 (2018), 2307–2325. https://doi.org/10.1051/m2an/2018040
    [28] R. Manohar, R. K. Sinha, Space-time a posteriori error analysis of finite element approximation for parabolic optimal control problems: A reconstruction approach, Optim. Contr. Appl. Met., 41 (2020), 1543–1567. https://doi.org/10.1002/oca.2618
    [29] I. Neitzel, B. Vexler, A priori error estimates for space-time finite element discretization of semilinear parabolic optimal control problems, Numer. Math., 120 (2012), 345–386. https://doi.org/10.1007/s00211-011-0409-9
    [30] E. Otárola, An adaptive finite element method for the sparse optimal control of fractional diffusion, Numer. Meth. Part. D. E., 36 (2020), 302–328. https://doi.org/10.1002/num.22429
    [31] N. P. Osmolovskii, H. Maurer, Equivalence of second order optimality conditions for bang-bang control problems. Part 1: Main results, Control Cybern., 34 (2005), 927–950.
    [32] N. P. Osmolovskii, H. Maurer, Equivalence of second order optimality conditions for bang-bang control problems. Part 2: Proofs, variational derivatives and representations, Control Cybern., 36 (2007), 5–45.
    [33] I. M. Ross, M. Karpenko, A review of pseudospectral optimal control: From theory to flight, Annu. Rev. Control, 36 (2012), 182–197. https://doi.org/10.1016/j.arcontrol.2012.09.002
    [34] D. Schötzau, C. Schwab, An hp a priori error analysis of the DG time-stepping method for initial value problems, Calcolo, 37 (2000), 207–232. https://doi.org/10.1007/s100920070002
    [35] B. Vexler, Finite element approximation of elliptic Dirichlet optimal control problems, Numer. Func. Anal. Opt., 28 (2007), 957–973. https://doi.org/10.1080/01630560701493305
    [36] J. Vlassenbroeck, R. Van Dooren, A Chebyshev technique for solving nonlinear optimal control problems, IEEE T. Automat. Contr., 33 (1988), 333–349. https://doi.org/10.1109/9.192187
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)