This paper focuses on the adaptive reinforcement learning-based optimal control problem for nonstrict-feedback nonlinear systems subject to actuator faults and an unknown input dead zone. To reduce the computational complexity and avoid convergence to local optima, a novel neural network weight-update algorithm is presented to replace the classic gradient descent method. By utilizing the backstepping technique, an actor-critic reinforcement learning control strategy is developed for high-order nonstrict-feedback nonlinear systems. In addition, two auxiliary parameters are introduced to handle the input dead zone and the actuator fault, respectively. All signals in the closed-loop system are proven to be semi-globally uniformly ultimately bounded via Lyapunov analysis. Finally, simulation results are presented to illustrate the effectiveness of the proposed approach.
Citation: Zichen Wang, Xin Wang. Fault-tolerant control for nonlinear systems with a dead zone: Reinforcement learning approach[J]. Mathematical Biosciences and Engineering, 2023, 20(4): 6334-6357. doi: 10.3934/mbe.2023274
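The abstract refers to an input dead zone and an actuator fault. For concreteness, the following is a minimal sketch of the standard models commonly used in this literature (a dead band with linear slopes outside it, and a loss-of-effectiveness-plus-bias fault); the function names and parameter values (`m`, `br`, `bl`, `rho`, `delta`) are illustrative assumptions, not the paper's own equations.

```python
def dead_zone(v, m=1.0, br=0.5, bl=0.5):
    """Standard dead-zone input model: the actuator output is zero inside
    the band [-bl, br] and linear with slope m outside it."""
    if v >= br:
        return m * (v - br)
    if v <= -bl:
        return m * (v + bl)
    return 0.0

def actuator_fault(u, rho=0.8, delta=0.1):
    """Loss-of-effectiveness-plus-bias fault model: rho in (0, 1] scales the
    commanded input and delta is an additive bias component."""
    return rho * u + delta
```

In fault-tolerant designs of this kind, the controller typically does not know `m`, `br`, `bl`, `rho`, or `delta`; auxiliary adaptive parameters (as mentioned in the abstract) are estimated online to compensate for their combined effect.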