Retraction

Retraction notice to "Identifications of the coefficients of the Taylor expansion (second order) of periodic non-collision solutions for the perturbed planar Keplerian Hamiltonian system " [AIMS Mathematics 8(7) (2023) 16528–16541]

  • Received: 13 July 2023 Accepted: 14 July 2023 Published: 17 July 2023
  • Citation: Riadh Chteoui. Retraction notice to 'Identifications of the coefficients of the Taylor expansion (second order) of periodic non-collision solutions for the perturbed planar Keplerian Hamiltonian system ' [AIMS Mathematics 8(7) (2023) 16528–16541][J]. AIMS Mathematics, 2023, 8(10): 22730-22730. doi: 10.3934/math.20231157



  • Stochastic differential games (SDGs) are a sophisticated and rewarding branch of game theory. In an SDG, decisions are made in an interactive environment, and the players try to find optimal policies while balancing the trade-off with their opponents. A key feature of SDGs is the use of stochastic differential equations with control variables to describe the state dynamics of the system; see, for example, [7,30] and the references therein. In most previous SDG problems, the parameters of the controlled system are assumed to be constants or functions of the controlled system itself. Since empirical studies have documented abrupt changes in financial-market returns (c.f. [11]), it is natural to construct a controlled system that captures the effects of structural shifts in macroeconomic conditions and business cycles on price dynamics. One typical stochastic system with regime switching has its roots in the early work of [24]. This inspires us to study the SDG problem in a system with regime switching. [4] provided a representative work investigating a nonzero-sum SDG problem within a jump-diffusion model with regime switching.

    Fund managers are often incentivized to invest in high-volatility, risky assets in pursuit of higher returns or to outperform market benchmarks, commonly known as "beating the market". Such incentives can elevate the risk of future losses for investors, making these aggressive strategies unpopular with shareholders and detrimental to the stable development of the financial market. Consequently, both shareholders and regulators must closely monitor the investment behavior of financial institutions.

    Among the most critical regulatory aspects is solvency. The 2008 financial crisis underscored significant gaps in capital and risk management within financial institutions. In response, global regulatory bodies implemented comprehensive reforms to enhance solvency standards and safeguard against systemic risk. An intriguing question arises: if solvency regulations were applied to the players in an SDG, what changes would ensue, and how could these changes be quantified? How might one model these "regulations" in a meaningful way? These considerations prompt us to investigate SDGs within a regime-switching model. Unlike the work of [4], this paper exclusively examines a nonzero-sum SDG with regime switching.

    In the past two decades, SDGs have garnered increasing interest in finance and actuarial science. For example, [7] studied investment games within the Black-Scholes model, and [4] extended this work to a jump-diffusion model. [17] explored SDGs with relative performance metrics and control constraints, and [2] examined SDGs for fund managers. In actuarial research, [49] and [42] investigated nonzero-sum SDGs between insurance and reinsurance companies, [23] analyzed SDGs between two defined-contribution pension plans, [36] studied robust SDGs under model uncertainty, and [10] explored optimal SDGs under the mean-variance premium principle. The common feature of the models used in these works is that the parameters of the controlled system are constants. Recently, more sophisticated models have been used in SDGs (c.f. [31,43]), or additional risks have been incorporated into the system, such as default risk or information asymmetry between the players (c.f. [13,51]). Compared to these works, research on SDGs under regime-switching models is relatively scarce, primarily because games under such models often do not admit closed-form solutions, which poses challenges for their study.

    The first application of the Markov regime switching models in economics was proposed by [24] and consisted of the analysis of business cycles. The business cycle interpretation of the model relied on the combined analysis of the signs of the regime-specific intercept terms and the historical narrative about the periods with high values of the smoothed state probabilities for each of the regimes. Accordingly, a negative value of the intercept term coincided with the periods of economic recessions, whereas its positive value was associated with economic expansions. The regime-switching framework is particularly useful for understanding the behavior of financial markets and insurance surplus processes under varying economic conditions, making it a powerful tool for both theoretical analysis and practical applications. For detailed topics in this model, we refer to [16,34,48].

    Solvency regulations pertaining to optimal investment and reinsurance strategies have long been a focus of investigation. [12] studied the impact of regulations on fair premium setting, and this topic has recently attracted increasing attention. For example, [18] studied optimal investment and premium setting under solvency regulations; [9] researched optimal investment under VaR regulations; [5] studied Pareto-optimal policies with solvency regulations; and [1] derived optimal reinsurance designs with solvency constraints. From a mathematical perspective, these optimal control problems are primarily modeled using single-period static frameworks, where various solvency requirements are incorporated as constraints in the optimization problems. The reason dynamic multi-period models are not considered is that solvency conditions based on control processes are often difficult to characterize or solve in dynamic models. This paper draws on the idea of [12] by using randomly arriving monitoring times to describe solvency constraints, with the aim of optimizing the decision-maker's performance before the arrival of these monitoring events.

    This paper addresses this issue within a competitive framework by formulating the problem as an SDG over a random time horizon. The problems in [7] are closely related to the topic discussed in this paper. Compared to [7], we incorporate a regime-switching structure into the dynamic control model and focus on the impact of random-time regulation. In [7], the HJBI (Hamilton-Jacobi-Bellman-Isaacs) equation is an elliptic PDE (partial differential equation). In this paper, however, the HJBI equation takes two forms: when the intensity process of the regulation time is a function of an external Markov chain, it is a coupled elliptic PDE; when the intensity process is a deterministic function of time, the HJBI equation is a parabolic PDE. The explicit solution methods applied in [7] are not valid here, so we develop two different methods for the two intensity specifications.

    In many cases, the random regulation times are assumed to follow an exponential distribution. In this paper, for practical relevance, we model the intensity of the random solvency regulation time in two different ways. The first model assumes that the intensity process is a function of an external Markov chain and is therefore itself a time-homogeneous Markov chain. The motivation is that external regulation is naturally influenced by the macroeconomic environment: when the environment is good, default risk is likely lower and regulation is less intense. A natural way to model such dependence is to assume that the arrival intensity is a function of the Markov chain. Both a constant arrival intensity and a Markov chain-modulated arrival intensity are time-homogeneous. Our other interest is the time-inhomogeneous case; for ease of exposition, we take the intensity process to be a deterministic function of time t.

    While [32,37] have explored SDGs with random durations, their models do not incorporate regime switching or an insurance context. When the intensity process of regulation time follows a Markov chain, we address the SDG by employing an auxiliary problem approach combined with a fixed-point method. We establish the expressions for optimal policies by resolving the auxiliary problem. In the regime-switching model context, we find that the optimal strategies for both players are akin to those in models without regime switching; however, players must dynamically adjust their strategies in response to the state transitions of the Markov chain.

    Through this approach, our study provides a theoretical foundation for investment games in stochastic environments and explores strategy formulation under uncertainty about regulatory intensity and market state transitions. In the case where the intensity process is deterministic, the associated HJBI equation takes the form of a time-dependent parabolic PDE. To solve this equation, we propose a numerical method based on a Markov chain approximation scheme. Additionally, we present several numerical examples to demonstrate the effects of regime switching and random-time solvency regulation on the optimal policies. These examples illustrate how the system's dynamics influence the decision-making processes of both investors and highlight the importance of incorporating these factors into investment strategies.

    The remainder of this paper is organized as follows: Section 2 introduces the model and outlines the key issues to be addressed. It also presents the HJBI equation that the value function of the SDG must satisfy. Section 3 discusses the scenario in which the intensity process is modeled as a Markov chain, while Section 4 examines the case where the intensity is a deterministic function of time t. The paper concludes with a discussion of numerical methods, algorithms, and illustrative examples to highlight our findings. Finally, for convenience, Table 1 lists the important notation used in the paper.

    Table 1.  Summary of notation used in this paper.
    Notation: Description
    {X_t, t ≥ 0}: External Markov chain modulating the dynamics of the market
    S_t^{(i)}, i = 1,2: Prices of the two risky assets
    θ_i, i = 1,2: Sharpe ratios of the two risky assets
    {f_t, g_t, t ≥ 0}: Investment policies adopted by Players A and B, respectively
    {Z_t^{f,g}, t ≥ 0}: Ratio process of the two players under controls f, g
    τ: Inter-arrival random time of regulations
    τ_x^{f,g}: First time the controlled process Z_t^{f,g} reaches x
    τ^{f,g}: First exit time of Z_t^{f,g} with Z_0 = z ∈ [l,u]
    v^{f,g}(z, α_i): Performance function of the SDG with initial state (Z_0, X_0) = (z, α_i), when the external regulation time is time-homogeneous
    v^{f,g}(t, z, α_i): Performance function of the SDG with initial state (Z_t, X_t) = (z, α_i), when the external regulation time is time-inhomogeneous
    v^{Au,f,g}(z, α_i): Performance function of the auxiliary SDG with initial state (Z_0, X_0) = (z, α_i), stopped at the first change of state of the external Markov chain
    J^{f_h,g_h}(t, z, α_i): Performance function of the approximating Markov chain
    V^{Au}(z, α_i): Value function of the auxiliary SDG
    P((z,α_i),(z+h,α_i)|f_h,g_h): Transition probability of the approximating Markov chain

    Let (Ω, F, P) be a complete probability space endowed with a right-continuous, P-completed filtration {F_t, t ≥ 0}. Assume that there are two correlated risky assets S_t^{(1)} and S_t^{(2)}, a risk-free bond B_t and an external environment process X := {X_t, t ≥ 0}. While both investors may invest in the risk-free asset, investor A chooses S^{(1)} = {S_t^{(1)}, t ≥ 0} and investor B chooses S^{(2)} = {S_t^{(2)}, t ≥ 0}. Assume that X := {X_t, t ≥ 0} is a continuous-time, finite-state, observable Markov chain taking values in the state space X := {α_1, α_2, …, α_d}, d ≥ 2. W_t^{(1)}, W_t^{(2)} are two correlated Brownian motions with correlation coefficient ρ_t := ρ(X_t). Let Q := [q_{ij}]_{i,j=1,2,…,d} be the generator of X. For each i ≠ j, q_{ij} is the constant intensity with which the Markov chain X changes from state α_i to state α_j. Assume that q_{ij} > 0 for i ≠ j and ∑_{j=1}^d q_{ij} = 0, so q_{ii} < 0. Denote q_i = −q_{ii} > 0, and let Q^T be the transpose of the matrix (or vector) Q. [15] presented the semi-martingale dynamics of X as

    X_t = X_0 + ∫_0^t Q^T X_u du + M_t,

    where {M_t, t ≥ 0} is a martingale with respect to {F_t, t ≥ 0}. Denote by τ_i the i-th jump time of X; then we have the following Lemma 2.1 (c.f. [22]).

    Lemma 2.1.

    P(τ_1 > t | X_0 = α_i) = e^{−q_i t}; (2.1)
    P(τ_1 ≤ t, X_{τ_1} = α_j | X_0 = α_i) = (1 − e^{−q_i t}) q_{ij}/q_i; (2.2)
    P(X_{τ_1} = α_j | X_0 = α_i) = q_{ij}/q_i. (2.3)
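
    To make Lemma 2.1 concrete, the following Python sketch simulates the chain X by drawing an Exp(q_i) holding time in the current state and then an embedded-chain jump with probabilities q_{ij}/q_i, exactly as in (2.1)–(2.3). The generator values and function names are illustrative assumptions, not taken from the paper.

    ```python
    import numpy as np

    def simulate_ctmc(Q, i0, horizon, seed=0):
        """Simulate a continuous-time Markov chain with generator Q on [0, horizon].

        Holding times in state i are Exp(q_i) with q_i = -Q[i, i], and the next state
        is j != i with probability Q[i, j] / q_i, cf. (2.1)-(2.3)."""
        rng = np.random.default_rng(seed)
        times, states = [0.0], [i0]
        t, i = 0.0, i0
        while True:
            q_i = -Q[i, i]
            t += rng.exponential(1.0 / q_i)          # holding time ~ Exp(q_i)
            if t >= horizon:
                break
            probs = Q[i].copy()
            probs[i] = 0.0
            probs = probs / q_i                      # embedded-chain probabilities q_ij / q_i
            i = int(rng.choice(len(Q), p=probs))
            times.append(t)
            states.append(i)
        return np.array(times), np.array(states)

    # illustrative two-state generator
    Q = np.array([[-0.1, 0.1],
                  [0.2, -0.2]])
    jump_times, visited = simulate_ctmc(Q, i0=0, horizon=10.0)
    ```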

    Assume that the risky assets evolve as

    dS_t^{(k)} = S_t^{(k)} ( μ_k(X_t) dt + σ_k(X_t) dW_t^{(k)} ), k = 1, 2,

    where μ_k(X_t) > 0 and σ_k(X_t) > 0, k = 1, 2, are the return rates and volatilities of the two risky assets, respectively. The dynamics of the risk-free asset is

    dB_t = r(X_t) B_t dt,

    where r(α_i) ≥ 0 for all i = 1, 2, …, d. Denote by

    θ_t^k(X_t) := (μ_k(X_t) − r(X_t)) / σ_k(X_t), k = 1, 2,

    the Sharpe ratio, or market price of risk, associated with asset S^{(k)} at time t.

    Denote by f_t the proportion of A's wealth invested in the risky asset S^{(1)} and by g_t the proportion of B's wealth invested in the risky asset S^{(2)}. We make the following assumption on the control policies:

    Assumption 1. (1) f_t (or g_t) is an F_t-adapted, measurable process satisfying

    E[∫_0^T f_t^2 dt] < ∞ (or E[∫_0^T g_t^2 dt] < ∞), for all T < ∞. (2.4)

    (2) Both short selling and borrowing are allowed in trading. Specifically, we allow f_t ≥ 1 (borrowing) or f_t < 0 (short selling), and likewise for g_t.

    Denote by Y_t^f (or Y_t^g) the wealth of investor A (or B) under policy {f_t, t ≥ 0} (or {g_t, t ≥ 0}) with Y_0^f = x_0 (or Y_0^g = y_0); then the dynamics of Y_t^f (or Y_t^g) is given by

    dY_t^f = Y_t^f ( [f_t σ_1(X_t) θ_t^1(X_t) + r(X_t)] dt + f_t σ_1(X_t) dW_t^{(1)} ) (2.5)

    or

    dY_t^g = Y_t^g ( [g_t σ_2(X_t) θ_t^2(X_t) + r(X_t)] dt + g_t σ_2(X_t) dW_t^{(2)} ). (2.6)

    While there are many possible competition objectives, we focus on games whose payoffs are related to the achievement of relative performance goals and shortfalls. For two numbers l < u with l ≤ x_0/y_0 ≤ u, we say that investor A attains its upper performance level u if Y_t^f = u Y_t^g for some t > 0, and that the lower shortfall occurs if Y_t^f = l Y_t^g for some t > 0. In general, A wins the game if A attains its upper performance level before reaching the lower shortfall, while B wins the game when the converse happens. In this paper, we further consider the impact of the regulation time on the decisions of both investors, where the regulation time is specified in Definition 2.2. Similar to the discussion in [7], some specific games we consider here, viewed from the perspective of investor A, are (within the regulation time)

    ● maximizing the probability that the performance goal u is attained before the shortfall l occurs;

    ● minimizing the expected time until the performance goal u is attained;

    ● maximizing the expected total discounted reward upon reaching the performance goal u.

    Similar to the framework in [7], we investigate the ratio of the two wealth processes. Denote the ratio process by Z_t^{f,g} with Z_t^{f,g} = Y_t^f / Y_t^g; then the dynamics of Z_t^{f,g} is given by

    dZ_t^{f,g} = Z_t^{f,g} [ m(f_t, g_t, X_t) dt + f_t σ_1(X_t) dW_t^{(1)} − g_t σ_2(X_t) dW_t^{(2)} ], (2.7)

    where

    m(f_t, g_t, X_t) := f_t σ_1(X_t) θ_t^1(X_t) − g_t σ_2(X_t) θ_t^2(X_t) − f_t g_t σ_1(X_t) σ_2(X_t) ρ(X_t) + g_t^2 σ_2^2(X_t). (2.8)
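
    As an illustration of (2.7)–(2.8), the sketch below simulates one path of the ratio process Z^{f,g} under constant proportional strategies f and g, with regime-dependent coefficients passed in as a plain dictionary. It reuses the jump times and states produced by the chain simulator above; all parameter values and helper names are hypothetical.

    ```python
    import numpy as np

    def simulate_ratio_path(z0, f, g, regimes, jump_times, visited, T, dt=1e-3, seed=1):
        """Euler-Maruyama path of dZ = Z[m dt + f*sigma1 dW1 - g*sigma2 dW2], Eqs (2.7)-(2.8).

        regimes[i] = (theta1, theta2, sigma1, sigma2, rho) for regime i."""
        rng = np.random.default_rng(seed)
        n_steps = int(T / dt)
        z = np.empty(n_steps + 1)
        z[0] = z0
        for k in range(n_steps):
            t = k * dt
            i = visited[np.searchsorted(jump_times, t, side="right") - 1]   # regime X_t
            th1, th2, s1, s2, rho = regimes[i]
            m = f * s1 * th1 - g * s2 * th2 - f * g * s1 * s2 * rho + g**2 * s2**2
            dw1 = rng.normal(0.0, np.sqrt(dt))
            dw2 = rho * dw1 + np.sqrt(1.0 - rho**2) * rng.normal(0.0, np.sqrt(dt))  # corr(dW1, dW2) = rho
            z[k + 1] = z[k] * (1.0 + m * dt + f * s1 * dw1 - g * s2 * dw2)
        return z
    ```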

    Note that {(Z_t^{f,g}, X_t), t ≥ 0} is a vector-valued Markov process with (Z_0, X_0) = (z, α_i); the infinitesimal generator of the process {(Z_t^{f,g}, X_t), t ≥ 0} is given by (assuming that the function F belongs to the domain of the operator A^{f,g})

    A^{f,g} F(t, z, α_i) = F_t(t, z, α_i) + m(f, g, α_i) z F_z(t, z, α_i) + (1/2) ν^2(f, g, α_i) z^2 F_{zz}(t, z, α_i) + ∑_{j=1}^d q_{ij} F(t, z, α_j),

    where F_t is the first partial derivative of F(·,·) w.r.t. t, and F_z, F_{zz} are the first and second partial derivatives of F(·,·) w.r.t. z,

    μ_{ki} = μ_k(α_i), σ_{ki} = σ_k(α_i), r_i = r(α_i), ρ_i = ρ(α_i), θ_{ki} = (μ_{ki} − r_i)/σ_{ki},
    m(f, g, α_i) = f σ_{1i} θ_{1i} − g σ_{2i} θ_{2i} − f g σ_{1i} σ_{2i} ρ_i + g^2 σ_{2i}^2,
    ν^2(f, g, α_i) = f^2 σ_{1i}^2 + g^2 σ_{2i}^2 − 2 f g σ_{1i} σ_{2i} ρ_i, k = 1, 2, i = 1, 2, …, d. (2.9)

    Let

    κ_i = θ_{1i}/θ_{2i} (2.10)

    denote the ratio of the market prices of risk of the two risky assets. We will see later that the parameter κ_i is a measure of the degree of advantage one player has over the other: A is said to have the advantage if κ_i > 1, and B is said to have the advantage if κ_i < 1.

    Define by τ_x^{f,g} the first time the controlled process Z_t^{f,g} reaches x ∈ [l,u], and by τ^{f,g} = min{τ_l^{f,g}, τ_u^{f,g}} the first exit time of Z_t^{f,g} with Z_0 = z ∈ [l,u].

    Definition 2.2 (Regulation time). Assume that τ is the inter-arrival random time of regulations for the two investors. We assume that there exists a nonnegative stochastic process λ_s, s ≥ 0, such that

    P(∫_0^∞ λ_s ds = +∞) = 1; (2.11)
    P(τ > t) = E[exp(−∫_0^t λ_s ds)]. (2.12)
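
    A convenient way to draw a regulation time consistent with (2.12) is to sample a unit exponential threshold E and set τ = inf{t : ∫_0^t λ_s ds ≥ E}. The sketch below does this on a discretized intensity path λ(X_t); the regime-to-intensity map is an assumed example and not taken from the paper.

    ```python
    import numpy as np

    def sample_regulation_time(lam_of_state, jump_times, visited, dt=1e-3, t_max=200.0, seed=2):
        """Sample tau with P(tau > t | X) = exp(-int_0^t lambda(X_s) ds), cf. (2.12).

        Draw E ~ Exp(1) and return the first time the cumulative hazard exceeds E."""
        rng = np.random.default_rng(seed)
        threshold = rng.exponential(1.0)
        cum_hazard, t = 0.0, 0.0
        while t < t_max:
            i = visited[np.searchsorted(jump_times, t, side="right") - 1]   # current regime X_t
            cum_hazard += lam_of_state[i] * dt
            t += dt
            if cum_hazard >= threshold:
                return t
        return np.inf   # no arrival on the truncated horizon

    # illustrative intensities: regulation arrives less often in the "good" regime
    lam_of_state = {0: 0.8, 1: 0.4}
    ```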

    Remark 1. Usually, the mortality-type intensity λ_s, s ≥ 0 is a constant or a deterministic function (c.f. [33]). In this paper, we additionally consider the case where λ_s, s ≥ 0 is a function of {X_t, t ≥ 0} and thus is itself a Markov chain. There is a natural interpretation of this model: the external environment not only affects the performance of the financial market, but the regulation frequency of the administrator also varies with the current state of the environment. For notational ease, denote λ_i = λ(α_i).

    In this subsection, we consider the case where the intensity process is a function of the Markov chain X_t, i.e., λ_t = λ(X_t) > 0. Since the exponential distribution is memoryless and λ_s is a Markov process, the performance function of the SDG takes the form

    v^{f,g}(z, α_i) = E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ)} h(Z_{τ^{f,g} ∧ τ}^{f,g}) ], (2.13)

    where E_{z,α_i} denotes the conditional expectation operator E_{z,α_i} = E[ · | Z_0 = z, X_0 = α_i], c(·) is the reward (cost) function, and h(·) is the terminal reward (or terminal penalty) of the game.

    Remark 2. We assume that c(·) and h(·) satisfy the polynomial growth condition, that is,

    |c(z)| ≤ C(1 + |z|^p), |h(z)| ≤ C(1 + |z|^p)

    for suitable C, p. The coefficients in (2.5) (or (2.6)) satisfy conditions (5.2) and (5.3) in Chapter IV.5 of [19]. By the results of Appendix D in [19], it follows that (2.5) (or (2.6)) admits a path-wise unique solution Y_t^f (or Y_t^g), which is F_t-progressively measurable and has continuous sample paths. By a similar argument, the existence of a solution of the stochastic differential equation (SDE) (2.7) is guaranteed. With the help of the aforementioned assumptions, just as claimed in Chapter IV.5 of [19], the performance function (2.13) is well-defined.

    The two investors compete as follows: A wants to maximize the payoff function v^{f,g}(z, α_i) while B wants to minimize v^{f,g}(z, α_i). We consider here only perfectly observed competition, that is, the policy adopted by one investor at any time can be observed instantaneously by the opposing investor. Let

    \underline{V}(z, α_i) = sup_f inf_g v^{f,g}(z, α_i), \bar{V}(z, α_i) = inf_g sup_f v^{f,g}(z, α_i) (2.14)

    be the lower value function and upper value function of the game, respectively.

    Definition 2.3. If \underline{V}(z, α_i) = \bar{V}(z, α_i), we say that the value function of the game exists, and it is naturally given by

    V(z, α_i) = \underline{V}(z, α_i) = \bar{V}(z, α_i). (2.15)

    This value can be attained if a saddle point for the payoff v^{f,g}(z, α_i), i = 1, 2, …, d, z ∈ [l,u] exists, i.e., there exist f^* = {f_t^*, t ≥ 0} and g^* = {g_t^*, t ≥ 0} such that for all (z, α_i) ∈ [l,u] × X and all admissible f and g, the following relations hold:

    v^{f,g^*}(z, α_i) ≤ v^{f^*,g^*}(z, α_i) ≤ v^{f^*,g}(z, α_i). (2.16)

    Then V(z, α_i) = v^{f^*,g^*}(z, α_i) and, thus, the saddle point exists and is given by (f^*, g^*).

    The second case assumes that λ_t is no longer a function of the time-homogeneous Markov chain X_t, but a deterministic function of t. In this case, the performance function of the game depends not only on the current state z of the controlled system but also on the current time t. For notational ease, introduce

    E_{t,z,α_i} = E[ · | (Z_t^{f,g}, X_t) = (z, α_i)], P_{t,z,α_i} = P[ · | (Z_t^{f,g}, X_t) = (z, α_i)]. (2.17)

    Let v^{f,g}(t, z, α_i) be the performance function under the policies f and g with initial value (t, z, α_i) and regulation time τ, defined by

    v^{f,g}(t, z, α_i) = E_{t,z,α_i} [ ∫_t^{τ^{f,g} ∧ τ} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ)} h(Z_{τ^{f,g} ∧ τ}^{f,g}) ]. (2.18)
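
    For fixed feedback policies, a performance function such as (2.18) can always be checked by plain Monte Carlo, which is useful for validating the PDE-based solutions developed below. The following sketch does this for constant policies and a constant regulation intensity; the reward functions c, h and all parameters are placeholders rather than the paper's specification.

    ```python
    import numpy as np

    def mc_performance(z0, i0, f, g, regimes, Q, lam, c, h_term, l, u,
                       delta=0.0, dt=1e-3, n_paths=2000, seed=3):
        """Monte Carlo estimate of a payoff like (2.18) for constant policies (f, g)."""
        rng = np.random.default_rng(seed)
        total = 0.0
        for _ in range(n_paths):
            z, i, t, payoff = z0, i0, 0.0, 0.0
            tau = rng.exponential(1.0 / lam)                      # constant-intensity regulation time
            while l < z < u and t < tau:
                th1, th2, s1, s2, rho = regimes[i]
                m = f * s1 * th1 - g * s2 * th2 - f * g * s1 * s2 * rho + g**2 * s2**2
                dw1 = rng.normal(0.0, np.sqrt(dt))
                dw2 = rho * dw1 + np.sqrt(1.0 - rho**2) * rng.normal(0.0, np.sqrt(dt))
                payoff += np.exp(-delta * t) * c(z) * dt          # running reward
                z *= 1.0 + m * dt + f * s1 * dw1 - g * s2 * dw2
                if rng.random() < -Q[i, i] * dt:                  # regime jump with prob ~ q_i dt
                    probs = Q[i].copy()
                    probs[i] = 0.0
                    i = int(rng.choice(len(Q), p=probs / -Q[i, i]))
                t += dt
            payoff += np.exp(-delta * min(t, tau)) * h_term(min(max(z, l), u))
            total += payoff
        return total / n_paths
    ```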

    We similarly define the value function and the saddle point of the game in this case as follows.

    Definition 2.4. Let

    \underline{V}(t, z, α_i) = sup_f inf_g v^{f,g}(t, z, α_i), \bar{V}(t, z, α_i) = inf_g sup_f v^{f,g}(t, z, α_i) (2.19)

    be the lower value and upper value of the SDG (2.18), respectively. If

    \underline{V}(t, z, α_i) = \bar{V}(t, z, α_i), (2.20)

    we say that the value function of the SDG (2.18) exists and is given by

    V(t, z, α_i) = \underline{V}(t, z, α_i) = \bar{V}(t, z, α_i). (2.21)

    If there exist f^* = {f_t^*, t ≥ 0} and g^* = {g_t^*, t ≥ 0} such that for all (t, z, α_i) ∈ [0,∞) × [l,u] × X and all admissible f and g,

    v^{f,g^*}(t, z, α_i) ≤ v^{f^*,g^*}(t, z, α_i) ≤ v^{f^*,g}(t, z, α_i), (2.22)

    then

    V(t, z, α_i) = v^{f^*,g^*}(t, z, α_i) (2.23)

    and thus the saddle point exists and is given by (f^*, g^*).

    Remark 3. The existence of the value function and of the saddle point plays a fundamental role in the study of SDGs; see, for instance, [8,14,20,41]. However, proving the existence of value functions presents various challenges depending on the framework of the SDG at hand. The SDG in this paper is characterized by the fact that, in addition to the control terms of the two players, it accommodates a Markov-modulated structure in the drifts and diffusions, as well as an external random "stopping" time. The focus of this paper is to find optimal policies for the players. Motivated by the results of [40] and [26], we find that it suffices to verify conditions A1)–A5) and A7) of [40], and our framework meets these conditions. Consequently, Theorem 5.3 of [40], which establishes the existence of the value function in an SDG with a Markov regime-switching structure over a stochastic time horizon, is applicable to our setting.

    We first introduce some notations and definitions:

    (1) For any function V(z, α_i), i = 1, 2, …, d with continuous second-order partial derivative w.r.t. z, denote by Θ the differential operator specified by

    Θ V(z, α_i) = (1 − ρ_i^2)[V_z(z, α_i) + z V_{zz}(z, α_i)]^2 − V_z(z, α_i)^2. (3.1)

    (2) V(z, α_i), i = 1, 2, …, d is said to be sufficiently fast-increasing on an interval (a, b) if the following condition holds:

    2 V_z(z, α_i) + z V_{zz}(z, α_i) > 0 (3.2)

    for i = 1, 2, …, d and z ∈ [l,u].

    We note that in our model, the advantage of the two investors is variable with respect to the economic environment, which differs significantly from the case presented by [7], making our problem more complex and realistic in practice. The following Theorem 3.1 presents the HJBI equation associated with problem (2.15). The proof of this theorem is similar to that of Theorem 4.1, so we only provide the proof for Theorem 4.1.

    Theorem 3.1. Suppose that the value function V(z, α_i): [l,u] × X → R, i = 1, 2, …, d has continuous second-order partial derivatives w.r.t. z, is strictly concave, and fulfills condition (3.2); then V(z, α_i), i = 1, 2, …, d solve the following equations for all z ∈ [l,u]:

    (z V_z(z, α_i)^2 / (2 Θ V(z, α_i))) θ_{2i}^2 [ 2κ_i(ρ_i − κ_i) V_z(z, α_i) − (1 + κ_i^2 − 2ρ_iκ_i) z V_{zz}(z, α_i) ] + c(z) − (λ_i + δ) V(z, α_i) + ∑_{j=1}^d q_{ij} V(z, α_j) = 0, i = 1, 2, …, d, (3.3)

    with

    V(l, α_i) = h(l) and V(u, α_i) = h(u) for i = 1, 2, …, d.

    If w(z, α_i), i = 1, 2, …, d solve the coupled HJBI equations (3.3) and satisfy

    (1) for all admissible policies f and g and for all t ≥ 0,

    ∫_0^t E[Z_s^{f,g} w_z(Z_s^{f,g}, X_s)]^2 [f_s^2 + g_s^2] ds < ∞; (3.4)

    (2) the functions

    z w_z(z, α_i) [ w_z(z, α_i) + |z w_{zz}(z, α_i)| ] / |Θ w(z, α_i)| (3.5)

    are uniformly bounded for all i = 1, 2, …, d;

    then we have

    w(z, α_i) = V(z, α_i) (3.6)

    and the feedback optimal controls are given by

    f^*(z, α_i) = (θ_{1i}/σ_{1i}) (w_z(z, α_i) / Θ w(z, α_i)) [ (ρ_i/κ_i − 1)(w_z(z, α_i) + z w_{zz}(z, α_i)) − w_z(z, α_i) ], (3.7)
    g^*(z, α_i) = (θ_{2i}/σ_{2i}) (w_z(z, α_i) / Θ w(z, α_i)) [ (1 − ρ_iκ_i)(w_z(z, α_i) + z w_{zz}(z, α_i)) − w_z(z, α_i) ]. (3.8)

    Moreover,

    f^*/g^* = σ_{2i}(ρ_i − κ_i) / (σ_{1i}(1 − ρ_iκ_i)). (3.9)
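
    Given any smooth candidate w(·, α_i) on a grid, the feedback formulas (3.7)–(3.8) are straightforward to evaluate numerically; the sketch below uses central differences on a uniform grid. The grid, parameter names and the use of numpy.gradient are illustrative assumptions.

    ```python
    import numpy as np

    def feedback_controls(z, w, theta1, theta2, sigma1, sigma2, rho):
        """Evaluate f*, g* of (3.7)-(3.8) from w sampled on a uniform grid z (one regime)."""
        dz = z[1] - z[0]
        wz = np.gradient(w, dz)                                   # w_z
        wzz = np.gradient(wz, dz)                                 # w_zz
        kappa = theta1 / theta2                                   # advantage parameter (2.10)
        theta_w = (1.0 - rho**2) * (wz + z * wzz)**2 - wz**2      # operator Theta w, Eq (3.1)
        f_star = (theta1 / sigma1) * (wz / theta_w) * ((rho / kappa - 1.0) * (wz + z * wzz) - wz)
        g_star = (theta2 / sigma2) * (wz / theta_w) * ((1.0 - rho * kappa) * (wz + z * wzz) - wz)
        return f_star, g_star
    ```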

    Deriving explicit solutions of the coupled HJBI equations (3.3) is generally not straightforward. In [45], a stochastic differential game was considered, yet explicit solutions were derived only under specific constraints on the system's coefficients. In this paper, we adopt the "fixed-point method" that [25] used to investigate optimal dividends within a Markov regime-switching model. This approach has been applied by [50] for singular optimal dividend control in a regime-switching Cramér-Lundberg model with interest on credit and debit, by [21] for portfolio optimization in a regime-switching market with derivatives, and by [46] for optimal investment and dividend strategies involving tax payments. Here, we re-examine a game problem subject to random-time regulation constraints, where the process halts if the current regime switches. Specifically, let τ_1 denote the first time the environment shifts. We then define an auxiliary game problem as follows:

    Auxiliary performance function:

    v^{Au,f,g}(z, α_i) = E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_1} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ ∧ τ_1)} h(Z_{τ^{f,g} ∧ τ ∧ τ_1}^{f,g}) ]. (3.10)

    Auxiliary value function:

    V^{Au}(z, α_i) = sup_f inf_g v^{Au,f,g}(z, α_i) = inf_g sup_f v^{Au,f,g}(z, α_i). (3.11)

    By a discussion similar to that of Theorem 3.1, the HJBI equation associated with the auxiliary problem is given in Corollary 1.

    Corollary 1. Suppose that the current state of the external environment is α_i, and that V^{Au}(z, α_i) has continuous second-order partial derivatives w.r.t. z, is strictly concave, and fulfills condition (3.2); then V^{Au}(z, α_i) solves the following equation for all z ∈ [l,u]:

    (z V_z^{Au}(z, α_i)^2 / (2 Θ V^{Au}(z, α_i))) θ_{2i}^2 [ 2κ_i(ρ_i − κ_i) V_z^{Au}(z, α_i) − (1 + κ_i^2 − 2ρ_iκ_i) z V_{zz}^{Au}(z, α_i) ] + c(z) − (λ_i + δ + q_i) V^{Au}(z, α_i) = 0, (3.12)

    with

    V^{Au}(l, α_i) = h(l) and V^{Au}(u, α_i) = h(u).

    For any given α_i, if there exists a regular solution w(z, α_i) to (3.12) that satisfies conditions analogous to (3.4) and (3.5), then

    w(z, α_i) = V^{Au}(z, α_i) (3.13)

    and the "feedback optimal controls" have the same form as in (3.7) and (3.8).

    Proof. The proof is very similar to that of Theorem 3.1 of [7] and is omitted here.

    We note that the HJBI equation in the auxiliary problem is not coupled; it is valid only until the current state changes. Assuming the current time is zero, the effective time interval for this policy is given by [0,ττ1). For the remainder of this section, we will proceed under the assumptions outlined in Corollary 1.

    Inspired by the Markov property of {X_t, t ≥ 0}, we introduce a candidate control process {(f_t^*, g_t^*), t ≥ 0} for the original problem over the entire control horizon as

    f_t^* = f^{Au,*}(X_t) = f^{Au,*}(X_{τ_k}), if τ_k ≤ t < τ_{k+1}, (3.14)
    g_t^* = g^{Au,*}(X_t) = g^{Au,*}(X_{τ_k}), if τ_k ≤ t < τ_{k+1}, (3.15)

    where f^{Au,*}(α_i), g^{Au,*}(α_i) denote the optimal feedback controls of the auxiliary problem in regime α_i.
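
    Operationally, (3.14)–(3.15) simply say: solve the auxiliary problem once per regime, store the resulting feedback maps, and apply the map for whichever regime is currently in force. A minimal sketch, with hypothetical container names, is:

    ```python
    # aux_controls[i] holds the auxiliary feedback pair (f_i, g_i) for regime i, e.g. two callables
    # built from the solution of (3.12) for that regime (see `feedback_controls` above).
    def candidate_policy(z, current_regime, aux_controls):
        """Candidate control (3.14)-(3.15): use the auxiliary optimizer of the regime in force."""
        f_map, g_map = aux_controls[current_regime]
        return f_map(z), g_map(z)
    ```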

    We observe that the candidate control process is piecewise deterministic, contingent solely on the current environment state. Consequently, under this policy, investors A and B each adopt environment-specific strategies and adjust their policies only upon state changes. Theorem 3.2 below establishes that the policies given by Eqs (3.14) and (3.15) are indeed optimal for both investors. For brevity, the proof is provided in Appendix A.

    Theorem 3.2. Suppose that λ_t = λ(X_t); then the control processes defined by Eqs (3.14) and (3.15) are optimal for both investors.

    Proof. For reading convenience, we put the proof in Appendix A.

    In this subsection, we analyze the auxiliary game problem of maximizing or minimizing a player's expected time to outperform the opponent. Specifically, we focus on investor A's objective of minimizing the expected duration of victory, represented by the value function

    N(z, α_i) = inf_f sup_g E_{z,α_i}[ τ_u^{f,g} ∧ τ ∧ τ_1 ] = sup_g inf_f E_{z,α_i}[ τ_u^{f,g} ∧ τ ∧ τ_1 ]. (3.16)

    Similarly, let Ñ(z, α_i) = inf_g sup_f E_{z,α_i}[ τ_u^{f,g} ∧ τ ∧ τ_1 ] = sup_f inf_g E_{z,α_i}[ τ_u^{f,g} ∧ τ ∧ τ_1 ], so N(z, α_i) = Ñ(z, α_i). Note that in this case c(·) ≡ 1, δ = 0 and h(·) ≡ 0; thus, by Corollary 1, Ñ(z, α_i) is the solution to the equation

    (z Ñ_z(z, α_i)^2 / (2 Θ Ñ(z, α_i))) θ_{2i}^2 [ 2κ_i(ρ_i − κ_i) Ñ_z(z, α_i) − (1 + κ_i^2 − 2ρ_iκ_i) z Ñ_{zz}(z, α_i) ] + 1 − (λ_i + q_i) Ñ(z, α_i) = 0, (3.17)

    with boundary condition Ñ(u, α_i) = 0. [7] solved Eq (3.17) when λ_i + q_i = 0, which motivates us to find an explicit expression for Ñ(z, α_i) in a special case. Assume that Ñ(z, α_i) is of the form Ñ(z, α_i) = (1/(λ_i + q_i)) [ 1 − (z/u)^ζ ]. Following [7], we can solve the problem, and the final result is

    Ñ(z, α_i) = (1/(λ_i + q_i)) [ 1 − (z/u)^{ζ_+} ],

    where ζ_+ is given by

    ζ_+ = [ θ_{2i}^2 (1 − κ_i^2) + √Δ ] / [ 2 θ_{2i}^2 (1 + κ_i^2 − 2ρ_iκ_i) + 4(λ_i + q_i)(1 − ρ_i^2) ],
    Δ = [ θ_{2i}^2 (1 − κ_i^2) ]^2 + 8 (λ_i + q_i) θ_{2i}^2 (1 + κ_i^2 − 2ρ_iκ_i) + 16 (λ_i + q_i)^2 (1 − ρ_i^2).

    Finally, the value of (3.16) is given by

    N(z, α_i) = (1/(λ_i + q_i)) [ 1 − (z/u)^{ζ_+} ]. (3.18)

    Then,

    N_z = −(1/(λ_i + q_i)) (ζ_+/u) (z/u)^{ζ_+ − 1},
    N_{zz} = −(1/(λ_i + q_i)) (ζ_+(ζ_+ − 1)/u^2) (z/u)^{ζ_+ − 2}.

    By calculation, the associated saddle point is given by

    f^*(z) = (θ_{1i}/σ_{1i}) [ (ρ_i/κ_i − 1)ζ_+ − 1 ] / [ (1 − ρ_i^2)ζ_+^2 − 1 ] and g^*(z) = (θ_{2i}/σ_{2i}) [ (1 − ρ_iκ_i)ζ_+ − 1 ] / [ (1 − ρ_i^2)ζ_+^2 − 1 ]. (3.19)
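
    The expected-time game above is fully explicit for one regime, so it is easy to tabulate. The sketch below evaluates ζ_+, Ñ(z, α_i) and the constant saddle-point controls (3.19); it follows the reconstructed formulas above, and all numerical inputs are placeholders.

    ```python
    import numpy as np

    def expected_time_game(z, u, theta2, kappa, rho, lam_plus_q):
        """Closed-form quantities of the expected-time game in one regime, cf. (3.17)-(3.19)."""
        a = theta2**2 * (1.0 - kappa**2)
        delta_disc = a**2 + 8.0 * lam_plus_q * theta2**2 * (1.0 + kappa**2 - 2.0 * rho * kappa) \
            + 16.0 * lam_plus_q**2 * (1.0 - rho**2)
        zeta = (a + np.sqrt(delta_disc)) / (2.0 * theta2**2 * (1.0 + kappa**2 - 2.0 * rho * kappa)
                                            + 4.0 * lam_plus_q * (1.0 - rho**2))
        value = (1.0 - (z / u)**zeta) / lam_plus_q                      # N(z, alpha_i), Eq (3.18)
        denom = (1.0 - rho**2) * zeta**2 - 1.0
        f_star_over = ((rho / kappa - 1.0) * zeta - 1.0) / denom        # multiply by theta1/sigma1
        g_star_over = ((1.0 - rho * kappa) * zeta - 1.0) / denom        # multiply by theta2/sigma2
        return zeta, value, f_star_over, g_star_over
    ```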

    In this game, player A aims to maximize the probability of reaching the upper level u, while player B aims to minimize it. When the game involves a single player, u approaches infinity and l = 0, this problem reduces to minimizing the ruin probability in the presence of investment opportunities, as discussed in [6]. According to Theorem 3.2, for a given current external state α_i, it is necessary to first solve a single-state optimization problem. Now, let R̃(z, α_i) be the value function of the auxiliary game; then

    R̃(z, α_i) = sup_f inf_g P_{z,α_i}( Z_{τ^{f,g} ∧ τ ∧ τ_1} = u ) = sup_f inf_g P_{z,α_i}( τ^{f,g} ∧ τ ∧ τ_1 = τ_u^{f,g} ).

    Note that in this case c(·) ≡ 0, δ = 0 and h(·) = 1_{{Z_{τ^{f,g} ∧ τ ∧ τ_1} = u}}; thus, by Corollary 1, R̃(z, α_i) is the solution to the equation

    (z R̃_z(z, α_i)^2 / (2 Θ R̃(z, α_i))) θ_{2i}^2 [ 2κ_i(ρ_i − κ_i) R̃_z(z, α_i) − (1 + κ_i^2 − 2ρ_iκ_i) z R̃_{zz}(z, α_i) ] − (λ_i + q_i) R̃(z, α_i) = 0, (3.20)

    with boundary conditions R̃(u, α_i) = 1, R̃(l, α_i) = 0. Substituting the expression of Θ R̃(z, α_i) into (3.20) and multiplying through by 2 Θ R̃(z, α_i) yields

    z θ_{2i}^2 R̃_z(z, α_i)^2 [ 2κ_i(ρ_i − κ_i) R̃_z(z, α_i) − (1 + κ_i^2 − 2ρ_iκ_i) z R̃_{zz}(z, α_i) ] − 2(λ_i + q_i) R̃(z, α_i) [ (1 − ρ_i^2)(R̃_z(z, α_i) + z R̃_{zz}(z, α_i))^2 − R̃_z(z, α_i)^2 ] = 0. (3.21)

    This equation can be handled by numerical methods.

    The following Theorem 4.1 gives the HJBI equation associated with the SDG problem when λt is a deterministic function. For convenience, the proof of Theorem 4.1 is provided in the Appendix.

    Theorem 4.1. Suppose that λ_t is a positive deterministic function of t, and that w(t, z, α_i): [0,∞) × [l,u] × X → R has continuous second-order partial derivatives w.r.t. z, is strictly concave, fulfills condition (3.2), and solves the following equation for all z ∈ [l,u]:

    w_t(t, z, α_i) + (z w_z(t, z, α_i)^2 / (2 Θ w(t, z, α_i))) θ_{2i}^2 [ (1 − κ_i^2) w_z(t, z, α_i) − (1 + κ_i^2 − 2ρ_iκ_i)(w_z(t, z, α_i) + z w_{zz}(t, z, α_i)) ] + c(z) − (δ + λ_t) w(t, z, α_i) + ∑_{j=1}^d q_{ij} w(t, z, α_j) = 0, i = 1, 2, …, d, (4.1)

    with

    w(t, l, α_i) = h(l) and w(t, u, α_i) = h(u) for i = 1, 2, …, d.

    We further suppose that

    (1) for all admissible policies f and g and for all t ≥ 0,

    ∫_0^t E[Z_s^{f,g} w_z(s, Z_s^{f,g}, X_s)]^2 [f_s^2 + g_s^2] ds < ∞; (4.2)

    (2) the function

    z w_z(t, z, α_i) [ w_z(t, z, α_i) + |z w_{zz}(t, z, α_i)| ] / |Θ w(t, z, α_i)| (4.3)

    is uniformly bounded for all i = 1, 2, …, d.

    Then, w(t, z, α_i) is the value function of the SDG, i.e.,

    w(t, z, α_i) = v^{f^*,g^*}(t, z, α_i) = sup_f inf_g v^{f,g}(t, z, α_i) = inf_g sup_f v^{f,g}(t, z, α_i).

    The "feedback" saddle point of this SDG is specified by

    f^* = (θ_{1i}/σ_{1i}) (w_z(t, z, α_i) / Θ w(t, z, α_i)) [ (ρ_i/κ_i − 1)(w_z(t, z, α_i) + z w_{zz}(t, z, α_i)) − w_z(t, z, α_i) ], (4.4)
    g^* = (θ_{2i}/σ_{2i}) (w_z(t, z, α_i) / Θ w(t, z, α_i)) [ (1 − ρ_iκ_i)(w_z(t, z, α_i) + z w_{zz}(t, z, α_i)) − w_z(t, z, α_i) ]. (4.5)

    Section 3 presents two examples of the SDG in which the arrival intensity of regulation is piecewise constant. In more general cases, however, deriving explicit solutions can be challenging, necessitating a shift to numerical methods. Since a Markovian SDG problem can be treated as a Markovian control problem, the construction of numerical schemes for the SDG can leverage numerical methods for stochastic control (see [26,28,38,39]). Note that the controlled ratio process is a map from [0,∞) to [l,u] stopped at τ^{f,g} ∧ τ.

    Let h > 0 and define L_h = {z : z = l + kh, k = 0, 1, 2, …, [(u−l)/h] + 1}, where [·] denotes the integer part. L_h is a discrete segmentation of the interval [l,u], on which {(ξ_k^h, e_k^h), k < ∞} is a controlled Markov chain: {ξ_k^h, k < ∞} approximates the underlying controlled ratio process {Z_t^{f,g}, t ≥ 0}, and {e_k^h, k < ∞} is the discrete-time observation of the external environment process {X_t, t ≥ 0}. Hence, for any chosen h, the domain of our numerical scheme with step h is

    D_h = {(z, α_i) : z ∈ L_h, i = 1, 2, …, d}. (4.6)

    The design of the approximating Markov chain on the domain D_h is analogous to that presented in [40]. This controlled Markov chain is constructed to be both discrete-time and finite-state for computational efficiency, while adhering to the local consistency properties of the controlled state system. Therefore, a crucial step in designing this Markov chain is establishing the transition probabilities. Denote the transition probability from state (z, α_i) to (y, α_j) under control (f_h, g_h) by P((z, α_i), (y, α_j) | f_h, g_h). To determine P((z, α_i), (y, α_j) | f_h, g_h), we have to meet the following three conditions:

    (1) (Local moment consistency) The transition probabilities P((z, α_i), (y, α_j) | f_h, g_h) must be chosen so that the Markov chain {ξ_k^h, k ≥ 1} has the same first and second moments as Z_t^{f,g} over a small time interval.

    (2) (Continuous-time interpolation and value function) To approximate the continuous-time controlled state process Z_t^{f,g}, we choose an appropriate continuous-time interpolation over each small time epoch. Suppose an interpolation epoch Δt_k^h = Δt^h(ξ_k^h, e_k^h) > 0, k ≥ 1 is given, and define the interpolated times t_k^h = ∑_{s=1}^{k−1} Δt_s^h. Then, a piecewise-constant interpolation ξ_t^h is specified by

    ξ_t^h = ξ_k^h for t ∈ [t_k^h, t_{k+1}^h). (4.7)

    The interpolation interval satisfies

    inf_{z ∈ L_h} Δt_k^h(z) > 0 and lim_{h→0} sup_{z ∈ L_h} Δt_k^h(z) = 0. (4.8)

    For the continuous-time interpolated process {(ξ_t^h, e_t^h), t ≥ 0}, define by

    τ_h^{f_h,g_h} = inf{ t : ξ_t^h ∉ (l,u) } (4.9)

    the first exit time of the Markov chain {ξ_t^h, t ≥ 0}, and let

    N_h − 1 := τ_h^{f_h,g_h} ∧ [τ/h]. (4.10)

    An approximating performance function is then defined as

    J^{f_h,g_h}(t, z, α_i) = E_{z,α_i}[ ∑_{n=0}^{N_h−1} e^{−δ n h} c(ξ_n^h) + e^{−δ N_h h} h(ξ_{N_h}^h) ], if t ≥ τ; J^{f_h,g_h}(t, z, α_i) = E_{z,α_i}[ ∑_{n=[t/h]}^{N_h−1} e^{−δ n h} c(ξ_n^h) + e^{−δ (N_h h − t)} h(ξ_{N_h}^h) ], if t < τ. (4.11)

    The upper value function and lower value function of the controlled Markov chain are then given by

    \bar{V}^h(t, z, α_i) = inf_{g_h ∈ A} sup_{f_h ∈ A} J^{f_h,g_h}(t, z, α_i), \underline{V}^h(t, z, α_i) = sup_{f_h ∈ A} inf_{g_h ∈ A} J^{f_h,g_h}(t, z, α_i).

    Notably, the control problem in this paper involves an external regulation time, which enters through the stopping time of the controlled system. To implement the computation, we need to discretize the integral ∫_0^t λ_s ds as follows. Let

    λ_j^h = ∫_{t_{j−1}^h}^{t_j^h} λ_s ds and Λ_k^h = ∑_{j=1}^k λ_j^h, p_j^h = e^{−λ_j^h} and \bar{F}_k^h = e^{−Λ_k^h}, j = 1, 2, …. (4.12)

    Specifically, the discretized value function \bar{V}^h(t, z, α_i) satisfies the following dynamic programming equation:

    \bar{V}^h(t_k^h, z, α_i) = min_{g_h} max_{f_h} { e^{−q_i h} [ \bar{F}_k^h [ P((z,α_i),(z+h,α_i)|f_h,g_h) \bar{V}^h(t_k^h + Δt_k^h, z+h, α_i) + P((z,α_i),(z−h,α_i)|f_h,g_h) \bar{V}^h(t_k^h + Δt_k^h, z−h, α_i) ] + p_k^h P((z,α_i),(z,α_i)|f_h,g_h) \bar{V}^h(t_k^h + Δt_k^h, z, α_i) ] + (1 − e^{−q_i h}) ∑_{j≠i} (q_{ij}/q_i) \bar{V}^h(t_k^h + Δt_k^h, z, α_j) + c(z) Δt_k^h }. (4.13)

    Similarly, we have

    \underline{V}^h(t_k^h, z, α_i) = max_{f_h} min_{g_h} { e^{−q_i h} [ \bar{F}_k^h [ P((z,α_i),(z+h,α_i)|f_h,g_h) \underline{V}^h(t_k^h + Δt_k^h, z+h, α_i) + P((z,α_i),(z−h,α_i)|f_h,g_h) \underline{V}^h(t_k^h + Δt_k^h, z−h, α_i) ] + p_k^h P((z,α_i),(z,α_i)|f_h,g_h) \underline{V}^h(t_k^h + Δt_k^h, z, α_i) ] + (1 − e^{−q_i h}) ∑_{j≠i} (q_{ij}/q_i) \underline{V}^h(t_k^h + Δt_k^h, z, α_j) + c(z) Δt_k^h }. (4.14)
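
    A direct way to implement (4.13)–(4.14) is a backward sweep in which the min-max at every grid node is taken by brute force over small candidate control sets. The sketch below is only a structural illustration: `trans_prob` and `dt_of` stand for the transition probabilities and interpolation step specified afterwards in (4.23)–(4.26), `Fbar` and `p_surv` are the survival weights F̄_k^h and p_k^h from (4.12), and the candidate control grids are an assumed simplification.

    ```python
    import numpy as np

    def backward_minimax_step(V_next, z_grid, Q, h, Fbar, p_surv,
                              f_cands, g_cands, trans_prob, dt_of, c):
        """One backward step of the discrete minimax recursion (4.13)/(4.14).

        V_next[iz, i]: value at the next interpolated time for node z_grid[iz], regime i.
        trans_prob(z, i, f, g) -> (p_up, p_down, p_stay); dt_of(z, i, f, g) -> interpolation step."""
        d = Q.shape[0]
        V = np.empty_like(V_next)
        for iz, z in enumerate(z_grid):
            up, dn = min(iz + 1, len(z_grid) - 1), max(iz - 1, 0)
            for i in range(d):
                q_i = -Q[i, i]
                # contribution of a regime jump occurring before the next step, cf. (2.2)-(2.3)
                jump = sum(Q[i, j] / q_i * V_next[iz, j] for j in range(d) if j != i)
                best = np.inf
                for g in g_cands:                       # outer minimization (player B)
                    worst = -np.inf
                    for f in f_cands:                   # inner maximization (player A)
                        p_up, p_dn, p_st = trans_prob(z, i, f, g)
                        diffusion = Fbar * (p_up * V_next[up, i] + p_dn * V_next[dn, i]) \
                            + p_surv * p_st * V_next[iz, i]
                        val = np.exp(-q_i * h) * diffusion + (1.0 - np.exp(-q_i * h)) * jump \
                            + c(z) * dt_of(z, i, f, g)
                        worst = max(worst, val)
                    best = min(best, worst)
                V[iz, i] = best
        return V
    ```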

    We know that if the value function of the game exists, then

    lim_{h→0} \bar{V}^h(t_k^h, z, α_i) = lim_{h→0} \underline{V}^h(t_k^h, z, α_i) = V(t, z, α_i), i = 1, 2, …, d. (4.15)

    (3) (Approximation of the HJBI equations) Suppose that V^h(t, z, α_i) is given by (4.15). The finite-difference method indicates that we need to approximate the first and second derivatives of V(t, z, α_i) using the step size h > 0 as

    \bar{V}^h(t, z, α_i) ≈ V(t, z, α_i), (4.16)
    [ \bar{V}^h(t+h, z, α_i) − \bar{V}^h(t, z, α_i) ] / h ≈ ∂V(t, z, α_i)/∂t, (4.17)
    [ \bar{V}^h(t, z+h, α_i) − \bar{V}^h(t, z, α_i) ] / h ≈ ∂V(t, z, α_i)/∂z, if m_i^h > 0, (4.18)
    [ \bar{V}^h(t, z, α_i) − \bar{V}^h(t, z−h, α_i) ] / h ≈ ∂V(t, z, α_i)/∂z, if m_i^h ≤ 0, (4.19)
    [ \bar{V}^h(t, z+h, α_i) + \bar{V}^h(t, z−h, α_i) − 2\bar{V}^h(t, z, α_i) ] / h^2 ≈ ∂^2V(t, z, α_i)/∂z^2. (4.20)

    Now, we turn to determining the transition probabilities. For notational convenience, let

    ν_i^h := ν^2(f_h, g_h, α_i) = f_h^2 σ_{1i}^2 + g_h^2 σ_{2i}^2 − 2 f_h g_h σ_{1i} σ_{2i} ρ_i,
    m_i^h := f_h σ_{1i} θ_{1i} − g_h σ_{2i} θ_{2i} − f_h g_h σ_{1i} σ_{2i} ρ_i + g_h^2 σ_{2i}^2,
    m_i^{h+} = max{m_i^h, 0}, m_i^{h−} = min{m_i^h, 0}, m_i^h = m_i^{h+} + m_i^{h−}, |m_i^h| = m_i^{h+} − m_i^{h−}, i = 1, 2, …, d. (4.21)

    Substituting (4.16)–(4.20) into (4.1) yields

    [ h(1 + |m_i^h| z) + ν_i^h z^2 + (λ_j^h + δ) h^2 ] \bar{V}^h(t_k^h, z, α_i) = sup_{f_h} min_{g_h} { h \bar{V}^h(t_k^h + Δt_k^h, z, α_i) + [ m_i^{h+} z h + (1/2) ν_i^h z^2 ] \bar{V}^h(t_k^h + Δt_k^h, z+h, α_i) + [ −m_i^{h−} z h + (1/2) ν_i^h z^2 ] \bar{V}^h(t_k^h + Δt_k^h, z−h, α_i) } + c(z) Δt_k^h + ∑_{j=1}^d q_{ij} \bar{V}^h(t_k^h + Δt_k^h, z, α_j). (4.22)

    By comparing coefficients in Eqs (4.13), (4.14) and (4.22), we can determine the transition probabilities of the constructed Markov chain, which are specified as

    P((z,α_i),(z+h,α_i)|f_h,g_h) = ( e^{q_i h} p_k^h / \bar{F}_k^h ) ( m_i^{h+} z h + (1/2) ν_i^h z^2 ) / ( h(1 + |m_i^h| z) + ν_i^h z^2 + (λ_j^h + δ) h^2 ), (4.23)
    P((z,α_i),(z−h,α_i)|f_h,g_h) = ( e^{q_i h} p_k^h / \bar{F}_k^h ) ( −m_i^{h−} z h + (1/2) ν_i^h z^2 ) / ( h(1 + |m_i^h| z) + ν_i^h z^2 + (λ_j^h + δ) h^2 ), (4.24)
    P((z,α_i),(z,α_i)|f_h,g_h) = 1 − P((z,α_i),(z+h,α_i)|f_h,g_h) − P((z,α_i),(z−h,α_i)|f_h,g_h), (4.25)
    Δt_k^h = ( e^{q_i h} p_k^h / \bar{F}_k^h ) h^2 / ( h(1 + |m_i^h| z) + ν_i^h z^2 + (λ_j^h + δ) h^2 ). (4.26)
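
    The sketch below packages the quantities (4.21) and the reconstructed transition probabilities and interpolation step (4.23)–(4.26) into a single helper (combining the roles of `trans_prob` and `dt_of` from the earlier sketch). All argument names are illustrative, and the formulas follow the reconstruction given above rather than a verified reference implementation.

    ```python
    import numpy as np

    def trans_prob_and_step(z, h, f, g, theta1, theta2, sigma1, sigma2, rho,
                            lam_h, delta, q_i, Fbar_k, p_k):
        """Transition probabilities and interpolation step of the approximating chain, cf. (4.21)-(4.26)."""
        m = f * sigma1 * theta1 - g * sigma2 * theta2 - f * g * sigma1 * sigma2 * rho + g**2 * sigma2**2
        nu2 = f**2 * sigma1**2 + g**2 * sigma2**2 - 2.0 * f * g * sigma1 * sigma2 * rho
        m_plus, m_minus = max(m, 0.0), min(m, 0.0)
        norm = h * (1.0 + abs(m) * z) + nu2 * z**2 + (lam_h + delta) * h**2   # common normaliser
        pref = np.exp(q_i * h) * p_k / Fbar_k                                 # survival correction
        p_up = pref * (m_plus * z * h + 0.5 * nu2 * z**2) / norm              # (4.23)
        p_dn = pref * (-m_minus * z * h + 0.5 * nu2 * z**2) / norm            # (4.24)
        p_stay = 1.0 - p_up - p_dn                                            # (4.25)
        dt = pref * h**2 / norm                                               # (4.26)
        return p_up, p_dn, p_stay, dt
    ```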

    It is easy to verify that

    0 < P((z,α_i),(z+h,α_i)|f_h,g_h), P((z,α_i),(z−h,α_i)|f_h,g_h) < 1, P((z,α_i),(z+h,α_i)|f_h,g_h) + P((z,α_i),(z−h,α_i)|f_h,g_h) < 1; (4.27)

    thus, the transition probabilities are well-defined.

    It is straightforward to verify that the constructed Markov chain, with the transition probabilities specified by Eqs (4.23) and (4.24), meets the local consistency conditions indicated by Conditions (1)–(3). Specifically, under the aforementioned transition probabilities, the constructed Markov chain satisfies the following local consistency properties:

    E[Δξ_k^h] = m_i^h Δt_k^h + o(Δt_k^h), (4.28)
    Var[Δξ_k^h] = ν_i^h Δt_k^h + o(Δt_k^h). (4.29)

    Thus far, we have established the transition probabilities of the approximating Markov chain, as defined by Eqs (4.23) and (4.24). By substituting these transition probabilities and the interpolated time epochs into Eqs (4.13) and (4.14), we obtain the iteration for approximating the discrete-time value function with the prescribed boundary conditions

    V^h(t, l, α_i) = h(l), V^h(t, u, α_i) = h(u). (4.30)

    By letting h → 0 and using Eqs (4.16)–(4.20), we can then approximate the value function and the optimal investment policies numerically.

    Remark 4. Comparing with the algorithms in [26], one may observe that the primary distinction between the two algorithms lies in the modification of the Markov chain's transition probabilities by the external regulation time. Consequently, these transition probabilities are no longer time-homogeneous, as they depend on both the current time and the residual distribution of the regulation time.

    In a typical game problem, agent A aims to maximize the probability of reaching the upper level u before regulation arrives or the lower level l is reached. The boundary conditions for this scenario are given by

    V^h(t, l, α_i) = 0, V^h(t, u, α_i) = 1. (4.31)

    We note that this goal-reaching problem is well-known in finance. Optimal control problems on this subject have yielded extensive results. For example, [3] studied the optimization of a bequest goal problem at a random time, specifically the death time of the insured individual. More recent research in this area includes [29]. The parameter settings for this example are provided as follows.

    (1) Parameters of the environment. For the dynamics of the environment, we only consider a two-state Markov chain, i.e., X_t ∈ {α_1, α_2}, where state α_1 represents a "bad" macroeconomic environment and α_2 a "good" one. Suppose that the Q-matrix of the Markov chain is given by

    Q = ( −0.1, 0.1; 0.2, −0.2 ).

    (2) Parameters of financial market For the financial market, we assume that the parameters are provided in Table 2. The Sharpe ratios for the two players operating in distinct environments are detailed in Table 3. Due to the setup of these parameters, it is apparent that the stock selected by Player A carries higher risk compared to the stock chosen by Player B. Nevertheless, the Sharpe ratio of the stock chosen by Player A exceeds that of Player B. Concurrently, the Sharpe ratio in a bull market surpasses that in a bear market. This parameter configuration is designed to closely mimic real-world conditions in our model.

    Table 2.  Parameters of financial market for two players.
    Player parameter bear market bull market
    A μ1 0.08 0.12
    A σ1 0.025 0.03
    B μ2 0.06 0.09
    B σ2 0.025 0.03
    risk free interest rate r 0.03 0.05

    Table 3.  Sharpe ratio for two investors.
    Player Sharpe ratio bear market bull market
    A θ1 2 2.3
    B θ2 1.5 1.6


    r = (r_1, r_2) = (0.01, 0.013), μ = (μ_1, μ_2) = (0.012, 0.018), σ_S = (σ_{S1}, σ_{S2}) = (0.02, 0.025). It can be observed that the Sharpe ratio in a "bad" environment is 0.1, while in a "good" environment it is 0.2. This indicates that the market price of risk in a good environment is higher than in a bad environment.

    (3) Parameters of the regulation intensity. Assume that the force of mortality follows the well-known Gompertz-Makeham law of mortality (c.f. [27]), i.e., the hazard rate λ_s is given by

    λ_s = A e^{B s} + C, so that the density of τ is (A e^{B s} + C) exp[ −C s − (A/B)(e^{B s} − 1) ], (4.32)

    and

    \bar{F}(t) = exp( −∫_0^t λ_s ds ) = exp[ −C t − (A/B)(e^{B t} − 1) ]. (4.33)

    One can see that the exponential distribution is a special case of the Gompertz-Makeham law. Here we adopt the parameter estimates of [27] as an example, namely

    A = 0.0007, B = 0.0006, C = 0.0831. (4.34)

    The conditional expectation of the residual regulation time, given the current time t, is given by

    1BeAB(ABln(AB)C)ˉF(t). (4.35)

    Numerical results show that with the parameters given by (4.34), the expected lifetime is about 79.04. However, for solvency regulation, this time horizon is too long. Based on (4.35), we revise the parameters to

    A = 0.07, B = 0.06, C = 0.0831. (4.36)

    Then, the expected regulation time is 1.1715.

    Figure 1 presents the value function with respect to the residual regulation time t and current wealth z. The range of the residual regulation time t is [0,2], and the wealth interval is [2,10]. To enhance visualization, we standardize the scales of the horizontal and vertical axes, transforming the range of values to [0,100]. From Figure 1, it is evident that the value function of the game problem is smooth and convex over its domain. Figures 2 and 3 illustrate the investment amounts chosen by the two players.

    Figure 1.  Value function of goal reaching game.
    Figure 2.  Investment amount of A.
    Figure 3.  Investment amount of B.

    It can be observed that, at the onset of the game, each player tends to invest a significant amount in risky assets. However, as the game nears the "end of regulation time", the investment amounts become more stationary and conservative.

    Figure 4 provides a comparison of the value function of the goal-reaching game across different regime scenarios. Ψ_1(z) represents the value function in a "bad" macroeconomic environment, while Ψ_2(z) represents the value function in a "good" environment. One may observe that when the current wealth ratio is relatively low, the environment significantly impacts the value of the game. Conversely, when the current ratio is relatively high, the values of the game converge. This phenomenon can be interpreted as follows: since both players operate in the same environment, when the current ratio is very high, the environment has a diminished impact on the winning probability of the game.

    Figure 4.  Comparison of the value functions under different environments.

    To assess the sensitivity of the numerical results to the parameters in this paper, we compare the value function under varying parameters. Table 4 presents the value function for different values of μ_1, with μ_2 = 0.08, while keeping the other parameters constant as listed in Table 3. Similarly, Table 5 shows the value function for different values of μ_2, with μ_1 = 0.08 and the remaining parameters unchanged from Table 3. Upon examining these tables, it is evident that the numerical results are stable with respect to changes in the parameters. Specifically, regarding variations in μ_1 and μ_2, Table 4 reveals that an increase in μ_1 leads to an increase in the probability of Player A winning the game, and this marginal effect increases with higher values of μ_1. It is also noteworthy, however, that as the current ratio z increases, the marginal effect due to increases in μ_1 diminishes. In contrast, Table 5 indicates that an increase in μ_2 results in a decrease in the probability of Player A winning, yet the marginal effect does not decrease, which differs from the findings in Table 4. Similar to the previous observation, the marginal effect in Table 5 also decreases with higher values of the current ratio z. This suggests that when Player A has a significant advantage over Player B, the benefits gained from selecting risky assets become less crucial for winning the game.

    Table 4.  Value function with various μ1.
    Current ratio z μ1=0.11 μ1=0.13 μ1=0.15
    0.000000 0.000000 0.000000 0.000000
    0.101266 0.265215 0.265609 0.266390
    0.202532 0.355853 0.356389 0.357510
    0.303797 0.444018 0.444608 0.445943
    0.405063 0.531364 0.532000 0.533563
    0.506329 0.618786 0.619368 0.620788
    0.607595 0.704511 0.705023 0.706283
    0.708861 0.788952 0.789406 0.790530
    0.810127 0.872271 0.872670 0.873683
    0.911392 0.954548 0.954900 0.955817
    0.974684 0.995346 1.005855 1.037951

    Table 5.  Value function with various μ2.
    current ratio z μ2=0.04 μ2=0.05 μ2=0.06
    0.000000 0.000000 0.000000 0.000000
    0.012658 0.154382 0.154380 0.154378
    0.202532 0.347252 0.347247 0.347241
    0.303797 0.435608 0.435601 0.435593
    0.405063 0.522875 0.522864 0.522855
    0.506329 0.610320 0.610307 0.610294
    0.607595 0.696173 0.696161 0.696148
    0.708861 0.780713 0.780700 0.780688
    0.810127 0.864116 0.864103 0.864090
    0.911392 0.946462 0.946450 0.946437
    0.974684 0.997462 0.997450 0.997438


    This paper investigates optimal investment games for two investors subject to random-time solvency regulation. We first introduce administrative random-time regulation into the stochastic investment game problem, providing a practical framework for understanding the impact of regulation on fund managers' risk-management strategies. Additionally, we incorporate regime-switching coefficients into the SDEs, enhancing the model's applicability across various economic scenarios.

    Methodologically, we prove the regularity of the value function when the intensity of regulatory time is a Markov chain, enabling optimal feedback control. By approximating the derivatives of the value function using difference methods, we simplify the numerical computation process. Furthermore, we develop a numerical scheme for the value function when the intensity of regulatory time is a deterministic function of time, utilizing a Markov chain approximation approach to solve PDEs with time-dependent parameters. These methods ensure robust and efficient numerical solutions for complex control problems.

    On the other hand, the practical relevance of this paper can be enhanced in several ways. For instance, incorporating scenarios where the two players have distinct regulation intensities and exploring potential dependencies between their intensity processes would enrich the analysis. Additionally, a rigorous proof of the existence of solutions to the HJBI equation using viscosity solution theory could provide stronger theoretical support. Lastly, adopting appropriate statistical methods for parameter calibration is paramount for the practical application of the regime-switching model.

    Lin Xu: Validation, Methodology; Linlin Wang: Software, Investigation; Hao Wang: Resources, Writing-review & editing; Liming Zhang: Formal analysis, Software. All authors have read and approved the final version of the manuscript for publication.

    The authors express their sincere gratitude for the constructive comments provided by the reviewers and editors. Lin Xu was supported by the Natural Science Foundation of Anhui Province [grant number 2408085MA019], Linlin Wang was supported by the National Natural Science Foundation of China [grant number 11971034], Hao Wang was supported by the National Natural Science Foundation of China [grant number 12301597], and Liming Zhang was supported by the National Natural Science Foundation of China [grant number 12201006].

    The authors declare that they have no conflicts of interest.

    Proof of Theorem 3.2. Note that in a zero-sum game problem, the policy adopted by one investor is instantaneously observed by the opponent, and thus, for both investors, the game problem becomes a min-max control problem or a max-min control problem. For later convenience, we introduce the shift operator of the Markov process; for a detailed introduction to this concept, readers are referred to [44]. Let

    (X_·, Z_·^{f,g}) : (E × R)^{R_+} → (E × R)^{R_+} (A.1)

    be the controlled canonical state process. Define the shift operators θ_t : (E × R)^{R_+} → (E × R)^{[t,∞)} for t > 0 by (θ_t ω)_s = ω_{s+t}, s, t ∈ R_+, ω ∈ (E × R)^{R_+}. It is clear that θ_t is F_t-measurable and θ_t(X_s, Z_s^{f,g}) = (X_{t+s}, Z_{t+s}^{f,g}). Let τ_0 = 0; then we have

    τ_{n+1} = τ_n + θ_{τ_n} τ_1, n = 0, 1, 2, …. (A.2)

    For a given suitable function w and control policies f and g, define two operators F and W acting on w as

    F^{f,g} w(z, α_i) = E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_1} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ ∧ τ_1)} w(Z_{τ^{f,g} ∧ τ ∧ τ_1}^{f,g}) ]. (A.3)

    Let

    \bar{W} w(z, α_i) = sup_f inf_g F^{f,g} w(z, α_i), \underline{W} w(z, α_i) = inf_g sup_f F^{f,g} w(z, α_i), (A.4)

    and if \underline{W} w(z, α_i) = \bar{W} w(z, α_i), define

    W w(z, α_i) = \underline{W} w(z, α_i) = \bar{W} w(z, α_i). (A.5)

    By the dynamic programming principle, for any policy g adopted by investor B, the value function of investor A satisfies

    \bar{V}(z, α_i) = sup_f inf_g E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_1} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ ∧ τ_1)} [ ∫_{τ^{f,g} ∧ τ ∧ τ_1}^{τ^{f,g} ∧ τ} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ − τ^{f,g} ∧ τ ∧ τ_1)} h(Z_{τ^{f,g} ∧ τ}^{f,g}) ] ] = sup_f inf_g E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_1} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ ∧ τ_1)} \bar{V}(Z_{τ^{f,g} ∧ τ ∧ τ_1}^{f,g}, X_{τ^{f,g} ∧ τ ∧ τ_1}) ] (A.6)
    = sup_f inf_g F^{f,g} \bar{V}(z, α_i) = \bar{W} \bar{V}(z, α_i). (A.7)

    Similarly, we have

    \underline{V}(Z_{τ^{f,g} ∧ τ ∧ τ_1}^{f,g}, X_{τ^{f,g} ∧ τ ∧ τ_1}) = \underline{W} V(Z_{τ^{f,g} ∧ τ ∧ τ_1}^{f,g}, X_{τ^{f,g} ∧ τ ∧ τ_1}), V(Z_{τ^{f,g} ∧ τ ∧ τ_n}^{f,g}, X_{τ^{f,g} ∧ τ ∧ τ_n}) = W V(Z_{τ^{f,g} ∧ τ ∧ τ_n}^{f,g}, X_{τ^{f,g} ∧ τ ∧ τ_n}). (A.8)

    Thus, to prove the regularity of the value function, we just need to show that the operator W is a contraction and that the candidate policies specified by (3.14) and (3.15) are indeed the optimal policies.

    To see this, from the structure of {(f_t^*, g_t^*), t ≥ 0}, we can see that, given the initial state X_0 = α_i, the optimal strategy is f^*(z, α_i), g^*(z, α_i) before time τ_1. Hence, if the current state of X_t is X_{τ_n}, the optimal strategy is f^*(Z_{τ_n}, X_{τ_n}), g^*(Z_{τ_n}, X_{τ_n}) before the next jump time of X_t. By noting that the operator F^{f,g} V(·,·) is defined by the path of (Z_t^{f,g}, X_t), t ≥ 0 up to the first transition time, and using (A.8), we conclude that

    \bar{V}^{f^*,g^*}(z, α_i) = E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_k} e^{−δ s} c(Z_s) ds + e^{−δ τ_k} \bar{V}(Z_{τ_k}^{f,g}, X_{τ_k}) ], for k = 1, 2, 3, …. (A.9)

    Following the method in [25], we prove Eq (A.9) by mathematical induction. It is obviously true for k = 1 (see Eq (A.8)). Suppose that Eq (A.9) holds for k = n. Then,

    \bar{V}(z, α_i) = E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_n} e^{−δ s} c(Z_s) ds + e^{−δ τ_n} \bar{V}(Z_{τ_n}, X_{τ_n}) 1_{(τ_n < τ^{f,g} ∧ τ)} ] = E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_n} e^{−δ s} c(Z_s) ds ] + E_{z,α_i} [ e^{−δ τ_n} \bar{V}(Z_{τ_n}, X_{τ_n}) 1_{(τ_n < τ^{f,g} ∧ τ < τ_{n+1})} ] (A.10)
    + E_{z,α_i} [ e^{−δ τ_n} \bar{V}(Z_{τ_n}, X_{τ_n}) 1_{(τ_{n+1} < τ^{f,g} ∧ τ)} ]. (A.11)

    Note that \bar{V}(Z_{τ_n}, X_{τ_n}) = θ_{τ_n} \bar{V}(Z_0, X_0). By the induction hypothesis, we have

    \bar{V}(Z_{τ_n}^{f,g}, X_{τ_n}) = θ_{τ_n} \bar{V}(Z_0, X_0) = E_{z,α_i} [ ∫_{τ_n}^{τ_{n+1}} e^{−δ s} c(Z_{s+τ_n}) ds | F_{τ_n} ] (A.12)
    + E_{z,α_i} [ e^{−δ τ_{n+1}} 1_{(τ_{n+1} < θ_{τ_n}(τ^{f,g} ∧ τ))} \bar{V}(Z_{τ_{n+1}}^{f,g}, X_{τ_{n+1}}) | F_{τ_n} ]. (A.13)

    Substituting Eqs (A.12) and (A.13) into Eqs (A.10) and (A.11), we obtain

    \bar{V}(z, α_i) = E_{z,α_i} [ ( ∫_0^{τ^{f,g} ∧ τ} e^{−δ s} c(Z_s) ds ) 1_{(τ_n < τ^{f,g} ∧ τ < τ_{n+1})} + ( ∫_0^{τ_{n+1}} e^{−δ s} c(Z_s) ds ) 1_{(τ_{n+1} < τ^{f,g} ∧ τ)} ] + E_{z,α_i} [ e^{−δ τ_{n+1}} \bar{V}(Z_{τ_{n+1}}^{f,g}, X_{τ_{n+1}}) 1_{(τ_{n+1} < τ^{f,g} ∧ τ)} ] = E_{z,α_i} [ ∫_0^{τ^{f,g} ∧ τ ∧ τ_{n+1}} e^{−δ s} c(Z_s) ds + e^{−δ τ_{n+1}} 1_{(τ_{n+1} < τ^{f,g} ∧ τ)} \bar{V}(Z_{τ_{n+1}}^{f,g}, X_{τ_{n+1}}) ]. (A.14)

    This shows that (A.9) also holds for k = n + 1. Since \bar{V}(z, α_i) is bounded and lim_{n→∞} τ_n = ∞, letting n → ∞ in the above equation, we have

    lim_{n→∞} e^{−δ τ_n} 1_{(τ_n < τ^{f,g} ∧ τ)} \bar{V}(Z_{τ_n}^{f,g}, X_{τ_n}) = E[ e^{−δ (τ^{f,g} ∧ τ)} h(Z_{τ^{f,g} ∧ τ}^{f,g}) ], (A.15)

    which shows that under the policy (f^*, g^*) the performance function is indeed the value function and the operator W is a contraction.

    Proof of Theorem 4.1. The idea of the proof is as follows: for any given investment strategy of B, investor A chooses the corresponding optimal investment strategy, and the same is done for B. Then, by Eq (2.16), we identify the optimal differential game policies. Specifically, for any policy g adopted by investor B, the HJBI equation of investor A for maximizing v^{f,g}(t, z, α_i) is

    sup_f { A^{f,g} v^{f,g}(t, z, α_i) + c − (δ + λ_t) v^{f,g}(t, z, α_i) } = 0. (A.16)

    Denote by f̃(t, z, α_i; g) the maximizer for investor A in Eq (A.16) under the given policy g; we then have

    A^{f̃,g} v^{f̃,g} + c − (δ + λ_t) v^{f̃,g} = 0. (A.17)

    Assuming that v_{zz}^{f̃,g} < 0, the first-order condition in f shows that the maximizer f̃(t, z, α_i; g) is of the form

    f̃(t, z, α_i; g) = g (σ_{2i} ρ_i / σ_{1i}) ( 1 + v_z^{*,g}(t, z, α_i) / (z v_{zz}^{*,g}(t, z, α_i)) ) − (θ_{1i}/σ_{1i}) v_z^{*,g}(t, z, α_i) / (z v_{zz}^{*,g}(t, z, α_i)), (A.18)

    where

    v^{*,g}(t, z, α_i) = sup_f v^{f,g}(t, z, α_i) = v^{f̃,g}(t, z, α_i).

    Obviously,

    inf_g v^{*,g}(t, z, α_i) = inf_g sup_f v^{f,g}(t, z, α_i) = \bar{V}(t, z, α_i)

    is the upper value of the SDG.

    Similarly, for any given policy f adopted by investor A, the HJBI equation of investor B for minimizing v^{f,g} is given by

    inf_g { A^{f,g} v(t, z, α_i) + c − (δ + λ_t) v(t, z, α_i) } = 0,

    and the minimizer for investor B is specified by

    g̃(t, z, α_i; f) = (θ_{2i}/σ_{2i}) z v_z^{f,*}(t, z, α_i) / ( 2 z v_z^{f,*}(t, z, α_i) + z^2 v_{zz}^{f,*}(t, z, α_i) ) + f (ρ_i σ_{1i}/σ_{2i}) ( z v_z^{f,*}(t, z, α_i) + z^2 v_{zz}^{f,*}(t, z, α_i) ) / ( 2 z v_z^{f,*}(t, z, α_i) + z^2 v_{zz}^{f,*}(t, z, α_i) ), (A.19)

    where v^{f,*} = inf_g v^{f,g} = v^{f,g̃}, and also

    sup_f inf_g v^{f,g}(t, z, α_i) = sup_f v^{f,*} = sup_f v^{f,g̃(f)} = \underline{V}(t, z, α_i)

    is the lower value function of the SDG. Since we have shown that the saddle point of the SDG (2.18) exists, the game must have an attainable value with

    v^{*,g̃} = v^{f̃,*}.

    If this is the case, then we can substitute Eq (A.19) into Eq (A.18). After some manipulation, one finds that

    f^*(t, z, α_i) = (θ_{1i}/σ_{1i}) (v_z(t, z, α_i) / Θ v(t, z, α_i)) [ (ρ_i/κ_i − 1)(v_z(t, z, α_i) + z v_{zz}(t, z, α_i)) − v_z(t, z, α_i) ]. (A.20)

    Conversely, it follows that

    g^*(t, z, α_i) = (θ_{2i}/σ_{2i}) (v_z(t, z, α_i) / Θ v(t, z, α_i)) [ (1 − ρ_iκ_i)(v_z(t, z, α_i) + z v_{zz}(t, z, α_i)) − v_z(t, z, α_i) ]. (A.21)

    Substituting (A.20) and (A.21) into (A.17), we finally find that the equations satisfied by the value function v(t, z, α_i) are of the form

    v_t + (z v_z(t, z, α_i)^2 / (2 Θ v(t, z, α_i))) θ_{2i}^2 [ (1 − κ_i^2) v_z(t, z, α_i) − (1 + κ_i^2 − 2ρ_iκ_i)(v_z(t, z, α_i) + z v_{zz}(t, z, α_i)) ] + c(z) − (δ + λ_t) v(t, z, α_i) + ∑_{j=1}^d q_{ij} v(t, z, α_j) = 0, i = 1, 2, …, d, (A.22)

    with boundary conditions

    v(t, l, α_i) = h(l) and v(t, u, α_i) = h(u) for i = 1, 2, …, d.

    One may observe that Eq (4.1) is just a reformulation of Eq (A.22). Thus, when the value function of the game exists and is smooth enough, it solves the coupled HJBI equation (4.1). On the other hand, we need to verify that the solution of the coupled equation (4.1) is the value function of the SDG. Although we could rely on the result of [20] to complete our proof, here we prove our "verification" theorem by the "martingale optimality principle", which is widely used in the literature; see, for example, [47].

    Suppose that w(t, z, α_i), i = 1, 2, …, d are the solutions to the coupled equations (4.1). For any policy pair (f, g), define a process

    M_h^{f,g} := e^{−δ h} w(t + h, Z_{t+h}^{f,g}, X_{t+h}) + ∫_t^{t+h} e^{−δ s} c(Z_s^{f,g}) ds. (A.23)

    Note that {(Z_t^{f,g}, X_t), t ≥ 0} is a vector-valued Markov process and {X_t, t ≥ 0} is a process of bounded total variation on finite time intervals; thus, by Itô's lemma (see [35]), we have that for any t ≤ τ^{f,g},

    M_h^{f,g} = M_0^{f,g} + ∫_t^{(t+h) ∧ τ^{f,g} ∧ τ} e^{−δ s} [ w_s(s, Z_s^{f,g}, X_s) + A w(s, Z_s^{f,g}, X_s) + c(Z_s^{f,g}) − δ w(s, Z_s^{f,g}, X_s) ] ds + ∫_t^{(t+h) ∧ τ^{f,g} ∧ τ} e^{−δ s} Z_s^{f,g} w_z(s, Z_s^{f,g}, X_s) [ f_s σ_1(X_s) dW_s^{(1)} − g_s σ_2(X_s) dW_s^{(2)} ].

    If Eq (4.2) of Theorem 4.1 holds, then for any t ≥ 0, M_{t ∧ τ^{f,g} ∧ τ}^{f,g} is a local martingale, and further, if Eq (4.3) holds, then M_{t ∧ τ^{f,g} ∧ τ}^{f,g} is a uniformly integrable super-martingale or sub-martingale, depending on the choice of the investment policies f and g. Note that P(τ^{f,g} ∧ τ < ∞) = 1 and the interval [l,u] is bounded; supposing that (Z_t^{f,g}, X_t) = (z, α_i) and letting h → ∞, it is easy to see that

    v^{f,g}(t, z, α_i) = E_{t,z,α_i} [ 1_{{τ ∧ τ^{f,g} > t}} ∫_t^{τ^{f,g} ∧ τ} e^{−δ s} c(Z_s^{f,g}) ds + e^{−δ (τ^{f,g} ∧ τ)} h(Z_{τ^{f,g} ∧ τ}^{f,g}) ] = E_{t,z,α_i} [ M_{τ^{f,g} ∧ τ}^{f,g} ] = E_{t,z,α_i} [ M_h^{f,g} + ∫_{t+h}^{τ^{f,g} ∧ τ} e^{−δ s} [ A^{f,g} w(s, Z_s^{f,g}, X_s) + c(Z_s^{f,g}) − (δ + λ_s) w(s, Z_s^{f,g}, X_s) ] ds ] = M_0^{f,g} + E_{t,z,α_i} [ ∫_t^{τ^{f,g} ∧ τ} e^{−δ s} [ A^{f,g} w(s, Z_s^{f,g}, X_s) + c(Z_s^{f,g}) − (δ + λ_s) w(s, Z_s^{f,g}, X_s) ] ds ] = w(t, z, α_i) + E_{t,z,α_i} [ ∫_t^{τ^{f,g} ∧ τ} e^{−δ s} [ A^{f,g} w(s, Z_s^{f,g}, X_s) + c(Z_s^{f,g}) − (δ + λ_s) w(s, Z_s^{f,g}, X_s) ] ds ]. (A.24)

    For the given policy g^* and any policy f adopted by investor A, Eq (A.16) implies that

    A^{f,g^*} w + c − (δ + λ_s) w ≤ 0

    and

    A^{f^*,g^*} w + c − (δ + λ_s) w = 0.

    Thus, by Eq (A.24) we have

    v^{f,g^*}(t, z, α_i) ≤ w(t, z, α_i) (A.25)

    and v^{f^*,g^*}(t, z, α_i) = w(t, z, α_i). By Eq (A.19), with a similar discussion, we find that for the given policy f^* we have

    A^{f^*,g} w + c − (δ + λ_s) w ≥ 0, A^{f^*,g^*} w + c − (δ + λ_s) w = 0,

    and hence

    v^{f^*,g}(t, z, α_i) ≥ w(t, z, α_i)

    and

    v^{f^*,g^*}(t, z, α_i) = w(t, z, α_i). (A.26)

    Combining Eqs (A.25) and (A.26), we obtain Eq (2.16), i.e.,

    v^{f,g^*}(t, z, α_i) ≤ v^{f^*,g^*}(t, z, α_i) = w(t, z, α_i) ≤ v^{f^*,g}(t, z, α_i),

    which proves that the solution of the coupled equation (4.1) is the value of the SDG.



    [1] R. Chteoui, Identifications of the coefficients of the Taylor expansion (second order) of periodic non-collision solutions for the perturbed planar Keplerian Hamiltonian system, AIMS Math., 8 (2023), 16528–16541. https://doi.org/10.3934/math.2023845 doi: 10.3934/math.2023845
    [2] H. Ounaies, Perturbation of planar, keplerian Hamiltonian systems, Nonlinear Anal.-Theor., 22 (1994), 675–696. Available from: https://www.sciencedirect.com/science/article/abs/pii/0362546X94902216?via%3Dihub