Research article

Nash equilibria in risk-sensitive Markov stopping games under communication conditions

  • Received: 19 April 2024; Revised: 28 July 2024; Accepted: 05 August 2024; Published: 13 August 2024
  • MSC: 91A05, 91A30, 93C55, 93E20

  • This paper analyzes the existence of a Nash equilibrium in a discrete-time Markov stopping game with two players. At each decision point, Player Ⅱ chooses either to end the game, thereby granting Player Ⅰ a final reward, or to let the game continue. In the latter case, Player Ⅰ performs an action that affects the transitions and receives a running reward from Player Ⅱ. We assume that Player Ⅰ has a constant, non-zero risk-sensitivity coefficient, while Player Ⅱ strives to minimize the utility of Player Ⅰ. The performance of decision strategies is measured by the risk-sensitive expected total reward of Player Ⅰ. Under mild continuity-compactness conditions and communication-ergodicity properties, we show that the value function of the game is characterized as the unique fixed point of the equilibrium operator, which determines a Nash equilibrium. In addition, we provide an illustrative example in which our assumptions hold.
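
The fixed-point characterization can be illustrated with a small numerical sketch. The abstract does not state the equilibrium operator, so the form used here is an assumption: for a risk-sensitivity coefficient `lam > 0` and `W = exp(lam * V)`, the operator is taken to be `min(exp(lam * G(x)), max_a exp(lam * R(x, a)) * sum_y p(y | x, a) * W(y))`. The function names, the two-state data, and the plain iteration scheme are all hypothetical and are not taken from the paper.

```python
import math


def equilibrium_operator(W, G, R, P, lam):
    """Apply the ASSUMED operator once (lam > 0), acting on W = exp(lam * V).

    G[x]       : final reward paid if Player II stops in state x
    R[x][a]    : running reward when Player I uses action a in state x
    P[x][a][y] : transition probability from x to y under action a
    """
    n = len(W)
    new_W = []
    for x in range(n):
        stop = math.exp(lam * G[x])  # Player II ends the game
        cont = max(                  # Player I picks the best action
            math.exp(lam * R[x][a]) * sum(P[x][a][y] * W[y] for y in range(n))
            for a in range(len(R[x]))
        )
        new_W.append(min(stop, cont))  # Player II minimizes
    return new_W


def solve_stopping_game(G, R, P, lam, tol=1e-12, max_iter=10_000):
    """Iterate the operator from W0 = exp(lam * G); return V = log(W) / lam."""
    W = [math.exp(lam * g) for g in G]
    for _ in range(max_iter):
        new_W = equilibrium_operator(W, G, R, P, lam)
        gap = max(abs(a - b) for a, b in zip(new_W, W))
        W = new_W
        if gap < tol:
            break
    return [math.log(w) / lam for w in W]


# Hypothetical two-state example (illustrative data only).
lam = 0.5
G = [1.0, 5.0]
R = [[0.2, 0.1], [0.5]]            # state 0 has two actions, state 1 has one
P = [
    [[1.0, 0.0], [0.0, 1.0]],      # state 0: stay, or move to state 1
    [[0.5, 0.5]],                  # state 1: move to 0 or stay, equally likely
]
values = solve_stopping_game(G, R, P, lam)
print(values)
```

In this toy instance Player Ⅱ stops immediately in state 0, while in state 1 continuing is cheaper for Player Ⅱ than paying the final reward, so the computed value there lies strictly below `G[1]`. The iteration is only a sketch of how a fixed point of such an operator might be approximated; it is not the method of proof used in the paper.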

    Citation: Jaicer López-Rivero, Hugo Cruz-Suárez, Carlos Camilo-Garay. Nash equilibria in risk-sensitive Markov stopping games under communication conditions[J]. AIMS Mathematics, 2024, 9(9): 23997-24017. doi: 10.3934/math.20241167



  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)