This paper considers maneuvering penetration methods for a missile that does not know the interceptor's guidance strategy in advance. Online intelligent maneuvering penetration methods are derived based on reinforcement learning. Once the missile is locked on by the interceptor, it exploits the interceptor's tracking characteristics by performing tentative maneuvers, each of which provokes a response from the interceptor. Using the information on these responses gathered by the missile-borne detectors, online game-confrontation learning driven by a reinforcement learning algorithm increases the miss distance within the interceptor's guidance blind area, and the learned results are used to generate maneuvering strategies that allow the missile to achieve successful penetration. Simulation results show that, compared with non-maneuvering or random maneuvering methods, the proposed methods not only yield a higher probability of successful penetration but also require smaller overloads and a lower command-switching frequency. Moreover, the maneuvering penetration methods remain effective with only a limited number of training runs.
Citation: Yaokun Wang, Kun Zhao, Juan L. G. Guirao, Kai Pan, Huatao Chen. Online intelligent maneuvering penetration methods of missile with respect to unknown intercepting strategies based on reinforcement learning[J]. Electronic Research Archive, 2022, 30(12): 4366-4381. doi: 10.3934/era.2022221
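The online learning loop described in the abstract — tentative maneuvers, observed interceptor responses, and a miss-distance reward — can be sketched as a tabular Q-learning scheme. The snippet below is a minimal illustrative toy, not the authors' simulation: the one-dimensional lagged-pursuit interceptor model, the discrete maneuver set, the state discretization, and all numerical parameters are assumptions made purely for demonstration.

```python
import random
from collections import defaultdict

# Normalized lateral acceleration commands the missile may choose from
# (a hypothetical discrete maneuver set).
ACTIONS = [-1.0, 0.0, 1.0]

def run_episode(q, eps, rng, alpha=0.1, gamma=0.9, steps=20):
    """One engagement: epsilon-greedy maneuvers against a toy pursuing
    interceptor; the terminal miss distance is the only reward."""
    y_m, y_i = 0.0, 0.0          # lateral positions: missile, interceptor
    state = 0                    # discretized relative lateral offset
    for t in range(steps):
        if rng.random() < eps:   # tentative (exploratory) maneuver
            a = rng.randrange(len(ACTIONS))
        else:                    # exploit the learned Q-table
            a = max(range(len(ACTIONS)), key=lambda k: q[(state, k)])
        y_m += ACTIONS[a]                 # missile maneuvers
        y_i += 0.8 * (y_m - y_i)          # interceptor pursues with lag
        next_state = max(-5, min(5, round(4 * (y_m - y_i))))
        # Reward: miss distance, granted only at the end of the engagement.
        r = abs(y_m - y_i) if t == steps - 1 else 0.0
        best_next = max(q[(next_state, k)] for k in range(len(ACTIONS)))
        q[(state, a)] += alpha * (r + gamma * best_next - q[(state, a)])
        state = next_state
    return abs(y_m - y_i)

rng = random.Random(0)
q = defaultdict(float)
# Decay exploration over episodes, mirroring the shift from tentative
# maneuvers to a learned penetration strategy.
misses = [run_episode(q, eps=max(0.05, 1.0 - ep / 50), rng=rng)
          for ep in range(100)]
```

Here the Q-table is updated online during the engagement itself, which loosely mirrors the paper's premise that the interceptor's strategy is unknown beforehand and must be probed through the interceptor's observed responses.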