A two-dimensional diffusion process is controlled until it enters a given subset of R2. The aim is to find the control that minimizes the expected value of a cost function in which there are no control costs. The optimal control can be expressed in terms of the value function, which gives the smallest value that the expected cost can take. To obtain the value function, one can make use of dynamic programming to find the differential equation it satisfies. This differential equation is a non-linear second-order partial differential equation. We find explicit solutions to this non-linear equation, subject to the appropriate boundary conditions, in important particular cases. The method of similarity solutions is used.
Citation: Mario Lefebvre. An optimal control problem without control costs[J]. Mathematical Biosciences and Engineering, 2023, 20(3): 5159-5168. doi: 10.3934/mbe.2023239
We consider a two-dimensional controlled diffusion process (X1(t),X2(t)) defined by the following system of stochastic differential equations:
dX_1(t) = f_1[X_1(t)] dt + b_1[X_1(t)] u^2(t) dt + {v_1[X_1(t)]}^{1/2} dB_1(t), | (1.1) |
dX_2(t) = f_2[X_2(t)] dt + b_2[X_2(t)] u(t) dt + {v_2[X_2(t)]}^{1/2} dB_2(t), | (1.2) |
where fi(⋅) is a real function, bi(⋅)≠0, u(t) is the control variable, vi(⋅)>0 and {Bi(t),t≥0} is a standard Brownian motion, for i=1,2. The two Brownian motions are assumed to be independent. The functions fi(⋅) and vi(⋅) are respectively the infinitesimal mean and variance of the uncontrolled process, for i=1,2. The functions b1(⋅) and b2(⋅) are control coefficients or parameters.
Let
T(x_1, x_2) = inf{t > 0 : (X_1(t), X_2(t)) ∈ D | (X_1(0), X_2(0)) = (x_1, x_2) ∉ D}, | (1.3) |
where D is a subset of R2. The random variable T is called a first-passage time in probability theory. The aim is to minimize the expected value of the cost function
J(x_1, x_2) = ∫_0^{T(x_1, x_2)} {q[X_1(t), X_2(t)] + λ} dt + K[X_1(T), X_2(T)], | (1.4) |
where q(⋅,⋅)≥0, λ is a real constant and K(⋅,⋅) is a general termination cost function. This type of stochastic optimal control problem is known as a homing problem; see Whittle [1, p. 289] or Whittle [2, p. 222]. Notice however that there are no control costs. Therefore, the above problem is actually an extension of the classic homing problem. Moreover, we see in Eqs (1.1) and (1.2) that the control variable does not have the same effect on each component of the two-dimensional diffusion process (X1(t),X2(t)). Notice also that the problem is time-invariant, because the functions fi(⋅),bi(⋅) and vi(⋅), for i=1,2, as well as q(⋅,⋅) and K(⋅,⋅) do not depend explicitly on t.
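To make the setup concrete, the controlled system (1.1)–(1.2) can be simulated with the Euler–Maruyama scheme. The sketch below is ours, not part of the paper: it uses the illustrative choices f_i ≡ 0, b_i ≡ 1, v_i ≡ 1, a user-supplied feedback control, and a stopping set D given as a predicate; all function names are hypothetical.

```python
import math, random

def simulate(x1, x2, u_of_state, in_D, dt=1e-3, t_max=100.0, rng=random):
    """Euler-Maruyama for the controlled system (1.1)-(1.2), here with the
    illustrative choices f_i = 0, b_i = 1, v_i = 1 (not the general model).
    Returns the first time (X1, X2) enters the stopping set D, capped at t_max."""
    t = 0.0
    sqdt = math.sqrt(dt)
    while t < t_max:
        u = u_of_state(x1, x2)
        x1 += u ** 2 * dt + rng.gauss(0.0, sqdt)   # drift b1*u^2 = u^2, unit variance
        x2 += u * dt + rng.gauss(0.0, sqdt)        # drift b2*u = u, unit variance
        t += dt
        if in_D(x1, x2):
            return t
    return t_max

random.seed(0)
# Example: stopping set D = {x1 - x2 <= 0 or x1 - x2 >= 1}, constant control 1/2
T = simulate(0.7, 0.2, lambda a, b: 0.5, lambda a, b: not (0.0 < a - b < 1.0))
```

With this negative drift on the difference X1 − X2, the process typically exits the strip quickly, so `T` is a realization of the first-passage time.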
Recent papers on homing problems include the following ones: Kounta and Dawson [3], Makasu [4] and Lefebvre [5]. The original homing problem has been extended in various ways: Lefebvre and Kounta [6] replaced the diffusion processes by discrete-time Markov chains, Lefebvre and Moutassim [7] considered the problem for jump-diffusion processes, and Lefebvre [8] treated the case of controlled autoregressive processes.
There are some papers on optimization problems for which the final time is random. However, this final time is not a first-passage time, as in homing problems. Such problems were considered, in particular, in Yang and Koo [9], Rodosthenous and Zhang [10], Yun and Choi [11], Khatab et al. [12] and Yu [13].
Homing problems are sometimes expressed as dynamical games; see Lefebvre [14]. It is possible to find papers on differential games with a random time horizon; see, for instance, Marín-Solano and Shevkoplyas [15] and Zaremba et al. [16]. However, in these papers, the final time is again not a first-passage time.
Next, we define the value function by
F(x_1, x_2) = inf_{u(t), 0 ≤ t ≤ T(x_1, x_2)} E[J(x_1, x_2)]. | (1.5) |
That is, F(x1,x2) is the expected cost obtained by using the optimal control in the interval [0,T]. In Section 2, we will make use of dynamic programming to find the differential equation it satisfies. This differential equation is a non-linear second-order partial differential equation (PDE). We will see that the optimal control u∗ can be expressed in terms of the value function as follows:
u^* = − (b_2(x_2)/(2 b_1(x_1))) · (F_{x_2}(x_1, x_2)/F_{x_1}(x_1, x_2)). | (1.6) |
In Section 3, we will find explicit solutions to the non-linear PDE satisfied by the value function, subject to the appropriate boundary conditions, in important particular cases. The method of similarity solutions will be used. Finally, some final remarks will be made in Section 4.
Bellman's principle of optimality states that "an optimal policy has the property that, whatever the initial state and the initial decision, it must constitute an optimal policy with regards to the state resulting from the first decision". Hence, any remaining part of an optimal policy is also optimal. Therefore, we can write that
F(x_1, x_2) = inf_{u(t), 0 ≤ t ≤ Δt} E[ ∫_0^{Δt} {q[X_1(t), X_2(t)] + λ} dt + F(x_1 + [f_1(x_1) + b_1(x_1) u^2(0)] Δt + v_1^{1/2}(x_1) B_1(Δt), x_2 + [f_2(x_2) + b_2(x_2) u(0)] Δt + v_2^{1/2}(x_2) B_2(Δt)) + o(Δt) ]. | (2.1) |
We have
∫_0^{Δt} {q[X_1(t), X_2(t)] + λ} dt ≃ [q(x_1, x_2) + λ] Δt. | (2.2) |
Moreover, a standard Brownian motion {B(t),t≥0} is such that
E[B(Δt)] = 0 and E[B^2(Δt)] = Var[B(Δt)] = Δt. | (2.3) |
It follows, assuming that F(x1,x2) is twice differentiable with respect to x1 and x2 and making use of Taylor's formula, that
F(x_1, x_2) = inf_{u(t), 0 ≤ t ≤ Δt} { [q(x_1, x_2) + λ] Δt + F(x_1, x_2) + [f_1(x_1) + b_1(x_1) u^2(0)] Δt F_{x_1} + (1/2) v_1(x_1) Δt F_{x_1 x_1} + [f_2(x_2) + b_2(x_2) u(0)] Δt F_{x_2} + (1/2) v_2(x_2) Δt F_{x_2 x_2} + o(Δt) }. | (2.4) |
Finally, dividing each side of the previous equation by Δt and letting Δt decrease to zero, we obtain the following dynamic programming equation:
0 = inf_{u(0)} { q(x_1, x_2) + λ + [f_1(x_1) + b_1(x_1) u^2(0)] F_{x_1} + (1/2) v_1(x_1) F_{x_1 x_1} + [f_2(x_2) + b_2(x_2) u(0)] F_{x_2} + (1/2) v_2(x_2) F_{x_2 x_2} }. | (2.5) |
Differentiating Eq (2.5) with respect to u(0), we find, as mentioned in the Introduction section, that the optimal control is
u^*(0) = − (b_2(x_2)/(2 b_1(x_1))) · (F_{x_2}(x_1, x_2)/F_{x_1}(x_1, x_2)). | (2.6) |
Then, substituting the above expression into Eq (2.5), we can state the following proposition.
Proposition 2.1. The value function F(x1,x2) satisfies the second-order, non-linear PDE
0 = q(x_1, x_2) + λ − (b_2^2(x_2)/(4 b_1(x_1))) · (F_{x_2}^2/F_{x_1}) + Σ_{i=1}^{2} { f_i(x_i) F_{x_i} + (1/2) v_i(x_i) F_{x_i x_i} }, | (2.7) |
subject to the boundary condition
F(x_1, x_2) = K(x_1, x_2) if (x_1, x_2) ∈ D. | (2.8) |
In the next section, explicit solutions to (2.7), (2.8) will be obtained in important particular cases. The method of similarity solutions will be used.
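The minimization behind Eqs (2.5) and (2.6) is elementary: the bracket in Eq (2.5) is quadratic in u(0) with leading coefficient b_1(x_1) F_{x_1}, so whenever that coefficient is positive the stationary point (2.6) is the global minimum. The following quick numerical check is ours (randomly drawn positive coefficients, with hypothetical variable names standing in for the state-dependent quantities):

```python
import random

def bracket(u, p):
    # The expression inside inf{...} in the dynamic programming equation (2.5)
    return (p['q'] + p['lam']
            + (p['f1'] + p['b1'] * u ** 2) * p['Fx1'] + 0.5 * p['v1'] * p['Fx11']
            + (p['f2'] + p['b2'] * u) * p['Fx2'] + 0.5 * p['v2'] * p['Fx22'])

random.seed(1)
worst = 0.0
for _ in range(200):
    # Random positive coefficients; b1*Fx1 > 0 makes the bracket convex in u
    p = {k: random.uniform(0.1, 2.0) for k in
         ('q', 'lam', 'f1', 'f2', 'b1', 'b2', 'v1', 'v2',
          'Fx1', 'Fx2', 'Fx11', 'Fx22')}
    u_star = -p['b2'] * p['Fx2'] / (2.0 * p['b1'] * p['Fx1'])   # Eq (2.6)
    grid_best = min(bracket(k / 50.0, p) for k in range(-400, 401))
    worst = max(worst, bracket(u_star, p) - grid_best)
# 'worst' stays <= 0: no grid point beats the candidate minimizer u_star
```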
Case I. The first particular case that we consider is the one for which f_i(⋅) ≡ 0, b_i(⋅) ≡ 1, v_i(⋅) ≡ 1, for i = 1, 2, q(⋅,⋅) ≡ 0, λ = 1, K(⋅,⋅) ≡ 0 and we choose the first-passage time
T_1(x_1, x_2) = inf{t > 0 : X_1(t) − X_2(t) = k_1 or k_2 | k_1 < x_1 − x_2 < k_2}, | (3.1) |
where xi=Xi(0) for i=1,2. The diffusion process (X1(t),X2(t)) is then defined by the stochastic differential equations
dX_1(t) = u^2(t) dt + dB_1(t), | (3.2) |
dX_2(t) = u(t) dt + dB_2(t). | (3.3) |
Thus, (X1(t),X2(t)) is a controlled two-dimensional standard Brownian motion. This case is arguably the simplest non-degenerate two-dimensional problem that can be examined. Equation (2.7) reduces to
0 = 1 − (1/4) (F_{x_2}^2/F_{x_1}) + (1/2) F_{x_1 x_1} + (1/2) F_{x_2 x_2}, | (3.4) |
subject to the boundary conditions
F(x_1, x_2) = 0 if x_1 − x_2 = k_1 or k_2. | (3.5) |
To solve (3.4), (3.5), we will make use of the method of similarity solutions. We look for a solution of the form
F(x1,x2)=H(w), | (3.6) |
where w:=x1−x2 is the similarity variable. For the method to work, we must be able to express both the Eq (3.4) and the boundary conditions (3.5) in terms of w. We find that Eq (3.4) is transformed into the second-order linear ordinary differential equation
0 = 1 − (1/4) H′(w) + H″(w), | (3.7) |
while the boundary conditions become
H(k1)=H(k2)=0. | (3.8) |
The general solution of Eq (3.7) can be expressed as follows:
H(w) = c_1 e^{w/4} + 4w + c_2. | (3.9) |
The particular solution that satisfies the boundary conditions (3.8) is
H(w) = 4w + 4 [k_1 e^{k_2/4} − k_2 e^{k_1/4} − (k_1 − k_2) e^{w/4}] / (e^{k_1/4} − e^{k_2/4}) | (3.10) |
for k1≤w≤k2. Let us choose k1=0 and k2=1. Then, the above solution reduces to
H(w) = 4w + 4 (1 − e^{w/4}) / (e^{1/4} − 1) for 0 ≤ w ≤ 1. | (3.11) |
It follows that the value function F(x1,x2) is given by
F(x_1, x_2) = 4 (x_1 − x_2) + 4 (1 − e^{(x_1 − x_2)/4}) / (e^{1/4} − 1) | (3.12) |
for (x1,x2)∈R2 such that 0≤x1−x2≤1.
Next, we deduce from Eq (2.6) and the fact that F_{x_1} = H′(w) = −F_{x_2} that the optimal control in this particular problem is actually a constant:
u^*(0) ≡ 1/2. | (3.13) |
Hence, the optimally controlled diffusion process satisfies
dX_1^*(t) = (1/4) dt + dB_1(t), | (3.14) |
dX_2^*(t) = (1/2) dt + dB_2(t). | (3.15) |
That is, {X∗1(t),t≥0} (respectively {X∗2(t),t≥0}) is a Wiener process with drift parameter 1/4 (resp. 1/2) and variance parameter 1. Since the two processes are independent, we can state that the one-dimensional process {X∗(t),t≥0} defined by
X^*(t) = X_1^*(t) − X_2^*(t) for t ≥ 0 | (3.16) |
is a Wiener process with drift parameter μ=−1/4 and variance parameter σ2=2.
Remarks. (i) With the choices q(⋅,⋅)≡0, λ=1 and K(⋅,⋅)≡0 that we made above, the cost function J(x1,x2) defined in Eq (1.4) reduces to T1(x1,x2). Therefore, the aim is to make the two-dimensional controlled process leave the continuation region as soon as possible. Even though there are no control costs, we saw that the optimal solution consists in choosing a (finite) constant control.
(ii) Let T_1^*(x_1, x_2) be the first-passage time when we use the optimal control. We may write that F(x_1, x_2) = E[T_1^*(x_1, x_2)]. The function m(w) := E[T_1^*(w = x_1 − x_2)] satisfies the second-order linear ordinary differential equation
m″(w) − (1/4) m′(w) = −1, | (3.17) |
subject to the boundary conditions m(0)=m(1)=0. We then deduce from Eqs (3.7) and (3.8) (with k1=0 and k2=1) that the functions H(w) and m(w) are the same.
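This agreement can also be checked by simulation; the sketch below is ours and is not in the paper. The optimally controlled difference X*(t) = X1*(t) − X2*(t) is a Wiener process with drift −1/4 and variance parameter 2, so a crude Euler scheme started at w = 1/2 should reproduce H(1/2) ≈ 0.125 up to discretization and Monte Carlo error.

```python
import math, random

def H(w):
    # Value function of Case I with k1 = 0 and k2 = 1, Eq (3.11)
    return 4.0 * w + 4.0 * (1.0 - math.exp(w / 4.0)) / (math.exp(0.25) - 1.0)

random.seed(42)
dt = 2.5e-4
sq = math.sqrt(2.0 * dt)          # std dev of one increment (variance parameter 2)
w0, n_paths, total = 0.5, 2000, 0.0
for _ in range(n_paths):
    w, t = w0, 0.0
    while 0.0 < w < 1.0:
        w += -0.25 * dt + random.gauss(0.0, sq)   # drift -1/4, variance 2
        t += dt
    total += t
estimate = total / n_paths        # Monte Carlo estimate of E[T1*(w0)]
```

Note that discrete monitoring of the boundary biases the estimated exit time slightly upward, so only rough agreement with H(1/2) can be expected at this step size.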
Case II. Assume now that f_i(⋅) ≡ 0, b_i[X_i(t)] = X_i(t), v_i[X_i(t)] = X_i^2(t), for i = 1, 2, q(⋅,⋅) ≡ 0, λ = 1 and K(⋅,⋅) ≡ 0. Moreover, we define
T_2(x_1, x_2) = inf{t > 0 : X_1^2(t)/X_2^2(t) = k_1 or k_2 | k_1 < x_1^2/x_2^2 < k_2}, | (3.18) |
where k1>0. The controlled diffusion process (X1(t),X2(t)) is such that
dX_1(t) = X_1(t) u^2(t) dt + X_1(t) dB_1(t), | (3.19) |
dX_2(t) = X_2(t) u(t) dt + X_2(t) dB_2(t). | (3.20) |
This time, (X1(t),X2(t)) is a controlled two-dimensional geometric Brownian motion. A geometric Brownian motion {Y(t),t≥0} can be expressed as the exponential of a Wiener process. Therefore, if we assume that Y(0)>0, then we can state that Y(t)>0 for any t≥0.
Equation (2.7) takes the form
0 = 1 − (x_2^2/(4 x_1)) (F_{x_2}^2/F_{x_1}) + (1/2) x_1^2 F_{x_1 x_1} + (1/2) x_2^2 F_{x_2 x_2}, | (3.21) |
and is subject to the boundary conditions
F(x_1, x_2) = 0 if x_1^2/x_2^2 = k_1 or k_2. | (3.22) |
Based on the boundary conditions, we now look for a solution of the form F(x_1, x_2) = H(w), where w := x_1^2/x_2^2. We have
F_{x_1} = H′(w) (2 x_1/x_2^2), F_{x_2} = H′(w) (−2 x_1^2/x_2^3), | (3.23) |
F_{x_1 x_1} = H″(w) (2 x_1/x_2^2)^2 + H′(w) (2/x_2^2) | (3.24) |
and
F_{x_2 x_2} = H″(w) (−2 x_1^2/x_2^3)^2 + H′(w) (6 x_1^2/x_2^4). | (3.25) |
Substituting these expressions into Eq (3.21), we find that it becomes
0 = 1 + (7/2) w H′(w) + 4 w^2 H″(w). | (3.26) |
The boundary conditions are simply H(k1)=H(k2)=0, as in Case I.
The general solution of Eq (3.26) is
H(w) = c_1 w^{1/8} + 2 ln(w) + c_2. | (3.27) |
With k1=1 and k2=2, we find that
H(w) = (2 ln(2)/(2^{1/8} − 1)) (1 − w^{1/8}) + 2 ln(w) for 1 ≤ w ≤ 2. | (3.28) |
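As a numerical sanity check (ours, not in the paper), one can verify by central finite differences that the similarity solution F(x_1, x_2) = H(x_1^2/x_2^2), with H as in Eq (3.28), satisfies the full PDE (3.21) at interior points and vanishes at the boundaries w = 1 and w = 2:

```python
import math

c = 2.0 * math.log(2.0) / (2.0 ** 0.125 - 1.0)

def H(w):
    # Explicit solution (3.28) of Case II, with k1 = 1 and k2 = 2
    return c * (1.0 - w ** 0.125) + 2.0 * math.log(w)

def F(x1, x2):
    # Similarity solution of the PDE (3.21)
    return H(x1 * x1 / (x2 * x2))

# Boundary conditions (3.22): H vanishes at w = 1 and w = 2
assert abs(H(1.0)) < 1e-12 and abs(H(2.0)) < 1e-12

# PDE residual by central finite differences at interior points (1 < w < 2)
h, max_residual = 1e-4, 0.0
for x1, x2 in [(1.05, 1.0), (1.3, 1.0), (1.4, 1.0)]:
    Fx1 = (F(x1 + h, x2) - F(x1 - h, x2)) / (2 * h)
    Fx2 = (F(x1, x2 + h) - F(x1, x2 - h)) / (2 * h)
    Fx1x1 = (F(x1 + h, x2) - 2 * F(x1, x2) + F(x1 - h, x2)) / h ** 2
    Fx2x2 = (F(x1, x2 + h) - 2 * F(x1, x2) + F(x1, x2 - h)) / h ** 2
    res = (1 - x2**2 / (4 * x1) * Fx2**2 / Fx1
           + 0.5 * x1**2 * Fx1x1 + 0.5 * x2**2 * Fx2x2)
    max_residual = max(max_residual, abs(res))
```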
Finally, from the expressions in Eq (3.23), we calculate
u^*(0) = − (x_2/(2 x_1)) · (−2 x_1^2/x_2^3)/(2 x_1/x_2^2) ≡ 1/2. | (3.29) |
Thus, the optimal control is again a constant. It follows that
dX_1^*(t) = (1/4) X_1^*(t) dt + X_1^*(t) dB_1(t), | (3.30) |
dX_2^*(t) = (1/2) X_2^*(t) dt + X_2^*(t) dB_2(t). | (3.31) |
The optimally controlled process {X_i^*(t), t ≥ 0} is also a geometric Brownian motion, for i = 1, 2. We can write that X_1^*(t) = e^{Z_1(t)}, where {Z_1(t), t ≥ 0} is a Wiener process with drift parameter −1/4 and variance parameter 1. Similarly, X_2^*(t) = e^{Z_2(t)}, where {Z_2(t), t ≥ 0} is a Wiener process with drift parameter 0 and variance parameter 1. Hence, by independence,
W(t) := [X_1^*(t)]^2 / [X_2^*(t)]^2 = e^{Z(t)}, | (3.32) |
where {Z(t), t ≥ 0} is a Wiener process with drift parameter −1/2 and variance parameter 8. The infinitesimal parameters of {W(t), t ≥ 0} are given by 7w/2 and 8w^2. Therefore, we may write that the function m(w) := E[T_2^*(w = x_1^2/x_2^2)] satisfies the second-order linear ordinary differential equation
4 w^2 m″(w) + (7/2) w m′(w) = −1, | (3.33) |
subject to m(1)=m(2)=0, from which we may conclude that the functions m(w) and H(w) coincide, as required.
Case III. To conclude this section, we will present a case when the optimal control is not a constant. Assume, in Case II, that b_1[X_1(t)] = X_1^2(t), b_2[X_2(t)] = X_2^{3/2}(t), λ = 0 and K(X_1(T_2), X_2(T_2)) = X_1^2(T_2)/X_2^2(T_2). Hence, there is only a termination cost. The aim is now to make the controlled process (X_1(t), X_2(t)) leave the continuation region through a given part of its boundary. Indeed, the optimizer must try to make X_1^2(t)/X_2^2(t) take on the value k_1 before k_2 (> k_1).
We find that Eq (3.26) becomes
0 = (−(1/2) w^{1/2} + 4w) H′(w) + 4 w^2 H″(w), | (3.34) |
subject to H(ki)=ki, for i=1,2. The general solution of the above equation is
H(w) = c_1 + c_2 Ei_1(1/(4 √w)), | (3.35) |
where Ei1 is an exponential integral function defined by
Ei_1(z) = ∫_1^∞ e^{−vz} v^{−1} dv. | (3.36) |
The particular solution that satisfies the boundary conditions H(1)=1 and H(2)=2 is
H(w) = [−Ei_1(1/(4 √w)) + 2 Ei_1(1/4) − Ei_1(√2/8)] / [Ei_1(1/4) − Ei_1(√2/8)] for 1 ≤ w ≤ 2. | (3.37) |
We can now calculate the optimal control. We find that
u^*(0) = √x_2 / (2 x_1). | (3.38) |
We notice that not only is the optimal control not a constant, it is not a function of w := x_1^2/x_2^2 either. The optimally controlled process (X_1^*(t), X_2^*(t)) satisfies the following stochastic differential equations:
dX_1^*(t) = (1/4) X_2^*(t) dt + X_1^*(t) dB_1(t), | (3.39) |
dX_2^*(t) = ([X_2^*(t)]^2/(2 X_1^*(t))) dt + X_2^*(t) dB_2(t). | (3.40) |
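The boundary behavior of the solution (3.37) can be confirmed numerically. The sketch below is ours: it evaluates Ei_1 by composite Simpson quadrature, using the definition (3.36) rewritten (via t = vz) as Ei_1(z) = ∫_z^∞ e^{−t} t^{−1} dt, and checks that H(1) = 1 and H(2) = 2.

```python
import math

def ei1(z, n=20000, upper=40.0):
    # Ei1(z) = integral of e^(-t)/t from z to infinity (equivalent to (3.36)),
    # computed by composite Simpson's rule on the truncated interval [z, z + upper]
    a, b = z, z + upper
    h = (b - a) / n
    s = math.exp(-a) / a + math.exp(-b) / b
    for i in range(1, n):
        t = a + i * h
        s += (4 if i % 2 else 2) * math.exp(-t) / t
    return s * h / 3.0

den = ei1(0.25) - ei1(math.sqrt(2.0) / 8.0)

def H(w):
    # Case III value function, Eq (3.37)
    return (-ei1(1.0 / (4.0 * math.sqrt(w))) + 2.0 * ei1(0.25)
            - ei1(math.sqrt(2.0) / 8.0)) / den

# Termination-cost boundary conditions: H(1) = 1 and H(2) = 2
assert abs(H(1.0) - 1.0) < 1e-9
assert abs(H(2.0) - 2.0) < 1e-9
```

H is increasing on [1, 2], consistent with the optimizer preferring to exit at w = k_1 = 1, where the termination cost is smaller.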
Remark. Another case for which the optimal control is not a constant is the one when we replace b1[X1(t)] by 1 and b2[X2(t)] by √X2(t) in Case III. This time, the value function is
F(x_1, x_2) = [Ei_1(−x_1/(4 x_2)) + Ei_1(−√2/4) − 2 Ei_1(−1/4)] / [Ei_1(−√2/4) − Ei_1(−1/4)] | (3.41) |
for x1>0 and x2>0 such that 1≤x21/x22≤2. Finally, the optimal control is given by
u^*(0) = x_1 / (2 √x_2). | (3.42) |
In this paper, a stochastic optimal control problem for a two-dimensional diffusion process (X1(t),X2(t)) has been considered. This problem is an extension of the so-called homing problems, in which the final time, rather than being either a fixed constant or infinity, is a random variable. The optimizer stops controlling the processes the first time a certain event occurs. Here, the cost function was modified: there were no control costs. However, the control variable u(t) was assumed to influence each part of the controlled process differently; namely, the state dynamics are quadratic in u(t) for X1(t), while they are linear in the case of X2(t).
In Section 2, we gave the PDE satisfied by the value function in the general case. Then, in Section 3, we presented various particular cases for which we were able to obtain explicit and exact solutions to the problems considered. The method of similarity solutions was used to solve the appropriate equations. Although there are no control costs, the optimal control was never either identical to zero or infinite.
When the method of similarity solutions fails, we could of course at least try to obtain numerical solutions to any particular problem. However, the aim of this paper was to present exact analytical solutions to important problems.
This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). The author also wishes to thank the anonymous reviewers of this paper for their constructive comments.
The author reports that there are no competing interests to declare.
[1] | P. Whittle, Optimization Over Time, Vol. I, Wiley, Chichester, 1982. |
[2] | P. Whittle, Risk-Sensitive Optimal Control, Wiley, Chichester, 1990. |
[3] | M. Kounta, N. J. Dawson, Linear quadratic Gaussian homing for Markov processes with regime switching and applications to controlled population growth/decay, Methodol. Comput. Appl. Probab., 23 (2021), 1155–1172. https://doi.org/10.1007/s11009-020-09800-2 |
[4] | C. Makasu, Homing problems with control in the diffusion coefficient, IEEE Trans. Autom. Control, 67 (2022), 3770–3772. https://doi.org/10.1109/TAC.2022.3157077 |
[5] | M. Lefebvre, Minimizing or maximizing the first-passage time to a time-dependent boundary, Optimization, 71 (2022), 387–401. https://doi.org/10.1080/02331934.2021.1914039 |
[6] | M. Lefebvre, M. Kounta, Discrete homing problems, Arch. Control Sci., 23 (2013), 5–18. https://doi.org/10.2478/v10170-011-0039-6 |
[7] | M. Lefebvre, A. Moutassim, Exact solutions to the homing problem for a Wiener process with jumps, Optimization, 70 (2021), 307–319. https://doi.org/10.1080/02331934.2019.1711084 |
[8] | M. Lefebvre, The homing problem for autoregressive processes, IMA J. Math. Control Inf., 39 (2022), 322–344. https://doi.org/10.1093/imamci/dnab047 |
[9] | Z. Yang, H. K. Koo, Optimal consumption and portfolio selection with early retirement option, Math. Oper. Res., 43 (2018), 1378–1404. https://doi.org/10.1287/moor.2017.0909 |
[10] | N. Rodosthenous, H. Zhang, Beating the omega clock: an optimal stopping problem with random time-horizon under spectrally negative Lévy models, Ann. Appl. Probab., 28 (2018), 2105–2140. https://doi.org/10.1214/17-AAP1322 |
[11] | W. Y. Yun, C. H. Choi, Optimum replacement intervals with random time horizon, J. Qual. Maint. Eng., 6 (2000), 269–274. https://doi.org/10.1108/13552510010346798 |
[12] | A. Khatab, N. Rezg, D. Ait-Kadi, Optimum block replacement policy over a random time horizon, J. Intell. Manuf., 22 (2011), 885–889. https://doi.org/10.1007/s10845-009-0364-9 |
[13] | Z. Yu, Continuous-time mean-variance portfolio selection with random horizon, Appl. Math. Optim., 68 (2013). https://doi.org/10.1007/S00245-013-9209-1 |
[14] | M. Lefebvre, A stochastic model for computer virus propagation, J. Dyn. Games, 7 (2020), 163–174. https://doi.org/10.3934/jdg.2020010 |
[15] | J. Marín-Solano, E. V. Shevkoplyas, Non-constant discounting and differential games with random time horizon, Automatica, 47 (2011), 2626–2638. https://doi.org/10.1016/j.automatica.2011.09.010 |
[16] | A. Zaremba, E. Gromova, A. Tur, A differential game with random time horizon and discontinuous distribution, Mathematics, 8 (2020). https://doi.org/10.3390/math8122185 |