
In this paper, a novel event-triggered optimal control method is developed for nonlinear discrete-time systems with constrained inputs. First, a non-quadratic utility function is constructed to overcome the challenge posed by saturating actuators. Second, a novel triggering condition is designed to reduce the computational burden; unlike other triggering conditions, it requires fewer assumptions to guarantee asymptotic stability. Then, the optimal cost function and control law are obtained by constructing the action and critic networks. Convergence analysis is provided, taking into account both the system state and the neural network weight estimation errors. Finally, the effectiveness and correctness of the proposed method are verified by two numerical examples.
Citation: Yuanyuan Cheng, Yuan Li. A novel event-triggered constrained control for nonlinear discrete-time systems[J]. AIMS Mathematics, 2023, 8(9): 20530-20545. doi: 10.3934/math.20231046
Optimality is one of the most significant properties of a control system. Generally, the Hamilton-Jacobi-Bellman (HJB) equation provides the framework for solving the optimal control problem. Nevertheless, obtaining its analytical solution is formidable. Therefore, adaptive dynamic programming (ADP) methods have been widely used to approximate its numerical solutions [1,2,3,4]. As research has deepened, ADP has shown great development potential.
However, with growing resource consumption and energy depletion, reducing energy loss has become a focus of industrial development. The event-triggered technique can greatly reduce the transmission and updating of information. As an advanced sampling method, the essence of the event-triggered mechanism is to decide when the controller updates by choosing an appropriate triggering condition, thereby saving energy [5,6,7]. Wang et al. designed a novel adaptive event-triggering condition, solving the event-triggered control problem for discrete-time nonlinear systems [8]. Wei et al. studied the event-based self-learning optimal regulation problem of discrete-time nonlinear systems and proved that a suitable triggering condition can ensure the stability of the system [9]. Event-triggered control is also widely applied in tracking control problems [10,11,12,13] and other fields [14]. Hu et al. developed an event-based approximate optimal tracking control method for discrete-time nonlinear systems [15]. Luo et al. introduced a novel event-triggered control policy and gave a detailed Lyapunov analysis for continuous-time systems [16].
Moreover, owing to ubiquitous physical constraints, practical systems are inevitably subject to saturation nonlinearities. Control constraints can easily degrade the overall performance of the system and make controller design more difficult than in the unconstrained case. Therefore, there is great interest in studying various systems with control constraints [17,18,19]. Ha et al. solved the constrained control problem by minimizing a novel nonquadratic cost function [20]. Ha et al. investigated an event-based controller for the near-optimal control of discrete-time systems with constrained inputs [21]. For the asymmetric input constraint problem, Sun et al. developed an event-triggered optimal control method [22]. Liu et al. designed a novel triggering condition with a simple form and few assumptions, solving the optimal control problem by using the heuristic dynamic programming (HDP) algorithm [23]. Considering the constrained-input problem, Liao et al. proposed an event-triggered dual heuristic dynamic programming (DHP) algorithm [24]. Mu et al. applied the global dual heuristic dynamic programming (GDHP) algorithm to solve the event-triggered constrained control of nonlinear discrete-time systems [25]. Compared with the HDP and DHP structures, the action-dependent dual heuristic programming (ADDHP) structure learns more system information, which enables it to obtain better control performance. This has motivated our study.
Given that the ADDHP algorithm has these advantages, we investigate a novel event-triggered control method based on it. The main contributions of this paper are listed as follows:
(1) A novel triggering condition is designed, which can effectively reduce the number of events occurring. Additionally, under this triggering condition, the stability of the system is proved with fewer assumptions. Hence, the novel event-based ADDHP algorithm is more practical for application.
(2) The convergence for the cost function and control inputs is proved theoretically.
(3) In the action-critic network, the influence of the control input on the cost function is considered. Thus, this method has a faster convergence rate and a higher approximate accuracy.
This paper is arranged as follows: Section 2 states the event-triggered constrained control problem. A novel triggering condition and the stability analysis of the system are provided in Section 3. Section 4 briefly introduces the implementation of the ADDHP algorithm and analyzes the convergence of the system states and neural network weights. In Section 5, two simulation examples are presented to verify the correctness of the proposed algorithm. Finally, some conclusions and the prospects for the future are given in Section 6.
Consider the following nonlinear discrete-time system with constrained inputs:
x(k+1)=F(x(k),u(k)),k=0,1,2,⋯, | (2.1) |
where x(k)∈Rn is the state vector, u(k)∈Rm is the control input, F(⋅,⋅) is an unknown system function. Ωu={u(k)|u(k)=[u1(k),u2(k),⋯,um(k)]T∈Rm,|uj(k)|≤ˉuj,j=1,2,⋯,m}, where ˉuj is the saturation level of the jth actuator. The origin x(k)=0 is the unique equilibrium point of the system (2.1) under u(k)=0, i.e., F(0,0)=0.
Assumption 1. [23] System (2.1) is controllable and observable, and the unknown system function F:Rn×Rm→Rn is Lipschitz continuous.
Assumption 1 implies that there exists a continuous state feedback control policy u(k)=μ(x(k)),μ:Rn→Rm that can stabilize system (2.1) to the equilibrium point.
In event-triggered control, we define a monotonically increasing time sequence {ki}∞i=0 as the sampling sequence. Between triggering instants, the control input is held constant over the interval [ki,ki+1) by a zero-order hold (ZOH). Therefore, the feedback control law can be expressed as
u(x(k))=μ(x(ki)). | (2.2) |
Since there is a gap between the sampled state x(ki) and the current state x(k), the triggering error is defined as
e(k)=x(ki)−x(k). | (2.3) |
Only when e(k)=0, i.e., x(k)=x(ki), i=0,1,2,⋯, is the current state marked as a sampling state and transmitted to the controller to update the control law. The control law can be rewritten as u(x(k))=μ(x(k)+e(k)), so system (2.1) becomes
x(k+1)=F(x(k),μ(x(k)+e(k))). | (2.4) |
The utility function is described as
$U(x(k),\mu(x(k_i))) = x^T(k)Qx(k) + T(\mu(x(k_i))) = x^T(k)Qx(k) + 2\int_{0}^{\mu(x(k_i))}\tanh^{-T}\big(\bar{U}^{-1}v\big)\,\bar{U}R\,\mathrm{d}v,$ | (2.5) |
where $Q\in\mathbb{R}^{n\times n}$ and $R$ are symmetric positive definite matrices, and $T(\mu(x(k_i)))$ is a positive non-quadratic function that ensures the control input $\mu(x(k_i))$ does not exceed the constraint boundary. $\bar{U}\in\mathbb{R}^{m\times m}$ is a constant diagonal matrix given by $\bar{U}=\mathrm{diag}\{\bar{u}_1,\bar{u}_2,\cdots,\bar{u}_m\}$.
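To make the utility concrete, the following sketch evaluates $U(x(k),\mu(x(k_i)))$ numerically for the diagonal-$R$ case, in which the line integral in (2.5) decomposes channel by channel; the function name and the example values are illustrative assumptions, not part of the original algorithm.

```python
import numpy as np
from scipy.integrate import quad

def utility(x, u, Q, R_diag, u_bar):
    """Non-quadratic utility U(x, u) of Eq. (2.5), sketched for diagonal R:
    T(u) = 2 * sum_j R_j * ubar_j * integral_0^{u_j} artanh(v / ubar_j) dv."""
    quadratic_term = x @ Q @ x
    T = 0.0
    for uj, rj, ubj in zip(np.atleast_1d(u), np.atleast_1d(R_diag), np.atleast_1d(u_bar)):
        integral, _ = quad(lambda v: np.arctanh(v / ubj), 0.0, uj)
        T += 2.0 * rj * ubj * integral
    return quadratic_term + T

# The integrand grows without bound as u_j approaches the saturation level
# ubar_j, which is what penalizes control actions near the constraint boundary.
x = np.array([-0.5, 0.5])
print(utility(x, u=0.09, Q=np.eye(2), R_diag=1.0, u_bar=0.1))
```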
The purpose of optimal control is to search for an optimal control strategy μ∗(x(ki)) to minimize the cost function:
$J(x(k)) = \sum_{i=k}^{\infty} U\big(x(i),\mu(x(k_i))\big).$ | (2.6) |
For the cost function J(x(k)), its Hamiltonian function is expressed as
H(x,μ,∇J)=U(x(i),μ(x(ki)))+∇JT(x)F(x,u), | (2.7) |
where ∇J(⋅)=∂J(⋅)/∂x(⋅). According to Bellman's optimality principle, the optimal cost function J∗(x(ki)) can be gained by solving the following HJB equation:
minμ∈ΩuH(x,μ,∇J∗)=0, | (2.8) |
where ∇J∗(0)=0, the optimal control law can be expressed as
μ∗(x(ki))=argminμ∈ΩuH(x,μ,∇J∗). | (2.9) |
In the following section, we will prove that the system is asymptotically stable under the designed triggering condition.
Design the triggering condition in the following form:
$\|e(k)\| \le e_T = \sqrt{\dfrac{1-\alpha C^2}{2C^2}}\,\|x(k_i)\|,$ | (3.1) |
where $C\in(0,1/\sqrt{\alpha})$ and $\alpha\in(2,1/C^2)$ are positive design constants. The triggering threshold $e_T$ is not unique; it depends on the sampled state $x(k_i)$ and the designed constants $\alpha$ and $C$. Then the next triggering instant can be obtained by
ki+1=inf{k|‖e(k)‖>eT,k>ki}. | (3.2) |
For discrete-time systems, the minimal inter-sample time is bounded below by a nonzero positive constant (one sampling period), so Zeno behavior is excluded.
Remark 1. The threshold has a similar form to that proposed in [23]. This paper introduces the parameter α that interacts with C. By adjusting these two parameters, the novel triggering condition can achieve higher resource utilization efficiency. It will be shown in the simulation example later. Compared with [24] and [25], the triggering condition designed in this paper is easy to implement and requires fewer assumptions.
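As a minimal sketch (using the parameter values α = 2.5 and C = 0.3 from Example 1 as assumed defaults), the trigger test of (3.1) and (3.2) amounts to a single norm comparison at every step:

```python
import numpy as np

def should_trigger(x_k, x_ki, alpha=2.5, C=0.3):
    """Return True when the triggering error ||e(k)|| = ||x(k_i) - x(k)||
    exceeds the state-dependent threshold e_T of Eq. (3.1).
    Requires C in (0, 1/sqrt(alpha)), i.e. alpha in (2, 1/C^2)."""
    e_norm = np.linalg.norm(x_ki - x_k)
    e_T = np.sqrt((1.0 - alpha * C**2) / (2.0 * C**2)) * np.linalg.norm(x_ki)
    return e_norm > e_T
```

When the test fires, the controller resamples $x(k_i)\leftarrow x(k)$, so $e(k)$ returns to zero and the held control input is refreshed.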
Definition 1. [23] If there exist $\mathcal{K}_\infty$ functions $\alpha_1,\alpha_2,\alpha_3$ and a $\mathcal{K}$ function $\beta$ such that the following inequalities hold:
α1(‖x(k)‖)≤V(x(k))≤α2(‖x(k)‖), | (3.3) |
V(F(x(k),μ(x(k)+e(k))))−V(x(k))≤−α3(‖x(k)‖)+β(‖e(k)‖), | (3.4) |
then the function V:Rn→R is called an input-to-state stability (ISS) Lyapunov function.
Assumption 2. [25] There exists a positive constant $C\in(0,1/\sqrt{\alpha})$ such that the following inequality holds:
$\|F(x(k),\mu(x(k)+e(k)))\| \le C\|x(k)\| + C\|e(k)\|.$ | (3.5) |
Theorem 3.1. Suppose that Assumptions 1 and 2 hold and the triggering condition is determined by (3.1), then the nonaffine system (2.4) is asymptotically stable.
Proof. Define the following Lyapunov function:
$V(x(k)) = x^T(k)Qx(k) + T(\mu(x(k_i))).$ | (3.6) |
For k∈[ki,ki+1), the control law stored in the ZOH drives the system and the term T(μ(x(ki))) remains constant, so the difference of the Lyapunov function is only related to the system state.
The first-order difference of V is
$\Delta V(x(k+1)) = x^T(k+1)Qx(k+1) - x^T(k)Qx(k) = \lambda_{\min}(Q)\big[\|x(k+1)\|^2 - \|x(k)\|^2\big].$ | (3.7) |
Define $\alpha_3(\|x(k)\|)=\lambda_{\min}(Q)(1-2C^2)\|x(k)\|^2$ and $\beta(\|e(k)\|)=2\lambda_{\min}(Q)C^2\|e(k)\|^2$. According to Assumption 2 and the Cauchy-Schwarz inequality, we can deduce that
$\Delta V(x(k+1)) \le \lambda_{\min}(Q)\big[(C\|x(k)\|+C\|e(k)\|)^2 - \|x(k)\|^2\big] \le \lambda_{\min}(Q)\big[(2C^2-1)\|x(k)\|^2 + 2C^2\|e(k)\|^2\big] = -\alpha_3(\|x(k)\|) + \beta(\|e(k)\|).$ | (3.8) |
According to Definition 1, V is an ISS Lyapunov function. Substituting the triggering condition into (3.8), we obtain
$\Delta V(x(k+1)) \le \lambda_{\min}(Q)\big[(2C^2-1)\|x(k)\|^2 + (1-\alpha C^2)\|x(k)\|^2\big] = \lambda_{\min}(Q)(2-\alpha)C^2\|x(k)\|^2.$ | (3.9) |
Since $\alpha\in(2,1/C^2)$, we have $\Delta V<0$ for all $x(k)\neq 0$. Therefore, the event-based system (2.4) is asymptotically stable.
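As a concrete check with the parameter values used later in Example 1 ($\alpha=2.5$, $C=0.3$), both design intervals are nonempty and the bound in (3.9) is strictly negative:
$C=0.3\in(0,1/\sqrt{2.5})\approx(0,0.632)$ and $\alpha=2.5\in(2,1/0.3^2)\approx(2,11.1)$, so $1-\alpha C^2=0.775>0$ gives $e_T\approx 2.075\,\|x(k_i)\|$, while $\Delta V(x(k+1))\le\lambda_{\min}(Q)(2-2.5)(0.3)^2\|x(k)\|^2=-0.045\,\lambda_{\min}(Q)\|x(k)\|^2$.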
Remark 2. Compared with [24] and [25], this paper establishes stability with fewer conditions. When the triggering condition is violated at the time instant k+1, the system works under the updated control law, which is equivalent to time-triggered control at k+1. According to optimal control theory, stability is guaranteed at this single instant.
Utilizing the advantages of neural networks, three networks are established to approximate the system dynamics, the costate function and the control law, respectively. Moreover, the event-triggered technique is introduced to reduce the communication burden. A schematic diagram of the event-triggered optimal control (ETOC) scheme is illustrated in Figure 1.
For notational simplicity, we define some notation before presenting the main results. The weight matrix from the input to the hidden layer is denoted by w, and the weight matrix from the hidden to the output layer is denoted by v. The activation function is set as ϑ(t)=(1−e−t)/(1+e−t); ζ and η represent the approximation error and the learning rate, respectively.
The model network is employed to identify the system dynamics x(k+1). Then x(k+1) can be represented as
$x(k+1) = w_m^T\vartheta_m(\sigma_{mk}) + \zeta_{mk},$ | (4.1) |
where $\sigma_{mk}=v_m^T\theta_k$ and $\theta_k=[x^T(k),u^T(k)]^T$ is the input vector. Since the optimal weight vector $w_m$ is usually unknown, we approximate it with $\hat{w}_m$; then the system state can be estimated as
$\hat{x}(k+1) = \hat{w}_m^T\vartheta_m(\sigma_{mk}).$ | (4.2) |
The error function of the model network is $e_m=\hat{x}(k+1)-x(k+1)$, and the objective performance function $E_m$ is defined as
$E_m = \frac{1}{2}e_m^Te_m.$ | (4.3) |
We apply the gradient descent algorithm to update ˆwm:
$\hat{w}_m(k+1) = \hat{w}_{mk} - \eta_m\frac{\partial E_m}{\partial \hat{w}_{mk}},$ | (4.4) |
$\frac{\partial E_m}{\partial \hat{w}_{mk}} = \frac{\partial E_m}{\partial e_m}\frac{\partial e_m}{\partial \hat{x}(k+1)}\frac{\partial \hat{x}(k+1)}{\partial \hat{w}_{mk}} = e_m\vartheta_m(\sigma_{mk}).$ | (4.5) |
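The following sketch implements one model-network update (4.4)–(4.5); the weight shapes and the learning rate η_m = 0.05 (taken from Section 5) are assumptions, and ϑ is the activation defined above.

```python
import numpy as np

def theta(t):
    # activation of Section 4: (1 - e^{-t}) / (1 + e^{-t}), i.e. tanh(t/2)
    return (1.0 - np.exp(-t)) / (1.0 + np.exp(-t))

def model_step(w_m, v_m, x_k, u_k, x_next, eta_m=0.05):
    """One gradient-descent step (4.4)-(4.5) for the model network.
    Assumed shapes: v_m is (n+m) x h_m, w_m is h_m x n."""
    theta_k = np.concatenate([x_k, u_k])     # theta_k = [x^T(k), u^T(k)]^T
    h = theta(v_m.T @ theta_k)               # hidden output theta_m(sigma_mk)
    x_hat = w_m.T @ h                        # Eq. (4.2)
    e_m = x_hat - x_next                     # identification error
    w_m = w_m - eta_m * np.outer(h, e_m)     # matrix form of e_m * theta_m
    return w_m, 0.5 * e_m @ e_m              # updated weights and loss E_m
```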
The critic network is used to approximate the costate function, which can be described as
$\hat{\lambda}^{(i+1)}(x(k+1)) = \hat{w}_c^T\vartheta_c(z_c(k+1)),$ | (4.6) |
where $z_c(k+1)=v_c^T\pi_{k+1}$ and $\pi_{k+1}=[\hat{x}^T(k+1),\hat{u}^T(k+1)]^T$ is the input vector; $\hat{\lambda}(x(k+1))=\partial\hat{J}(x(k+1))/\partial x(k+1)$ is the estimate of $\lambda(x(k+1))$.
We define the error function of the critic network as $e_c=\hat{\lambda}^{(i+1)}(x(k+1))-\lambda^{(i+1)}(x(k+1))$. The critic network is trained to minimize the performance measure $E_c=\frac{1}{2}e_c^Te_c$.
The weight tuning law is designed to obey a gradient-descent algorithm:
$\hat{w}_c(k+1) = \hat{w}_{ck} - \eta_c\frac{\partial E_c}{\partial \hat{w}_{ck}},$ | (4.7) |
$\frac{\partial E_c}{\partial \hat{w}_{ck}} = \frac{\partial E_c}{\partial e_c}\frac{\partial e_c}{\partial \hat{\lambda}(x(k+1))}\frac{\partial \hat{\lambda}(x(k+1))}{\partial \hat{w}_{ck}} = e_c\vartheta_c(z_c(k+1)).$ | (4.8) |
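A matching sketch of the critic update (4.7)–(4.8), reusing ϑ from the model-network sketch; the costate target λ^{(i+1)} comes from the ADDHP iteration and is treated here as a given input, and the shapes are assumptions.

```python
import numpy as np

def critic_step(w_c, v_c, x_next_hat, u_next_hat, lam_target, eta_c=0.05):
    """One gradient-descent step (4.7)-(4.8) for the critic network.
    Assumed shapes: v_c is (n+m) x h_c, w_c is h_c x n."""
    pi = np.concatenate([x_next_hat, u_next_hat])  # pi_{k+1}
    h = theta(v_c.T @ pi)                          # theta_c(z_c(k+1))
    lam_hat = w_c.T @ h                            # Eq. (4.6)
    e_c = lam_hat - lam_target                     # critic error
    return w_c - eta_c * np.outer(h, e_c)          # matrix form of e_c * theta_c
```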
The input of the action network is the sampling state x(ki), which is used to obtain the control law μ(x(ki)). Then μ(x(ki)) can be estimated as
$\hat{\mu}(x(k_i)) = \hat{w}_a^T\vartheta_a(\varsigma_{ak}),$ | (4.9) |
where $\varsigma_{ak}=v_a^Tx(k_i)$. Define the error function as $e_a=\hat{\lambda}^{(i+1)}(x(k+1))-J_C$, where $J_C$ denotes the desired ultimate objective and is generally set to zero. Thus, the target performance measure can be designed as $E_a=\frac{1}{2}e_a^Te_a$.
According to the gradient-descent algorithm, the weights can be updated as
$\hat{w}_a(k+1) = \hat{w}_{ak} - \eta_a\frac{\partial E_a}{\partial \hat{w}_{ak}},$ | (4.10) |
$\frac{\partial E_a}{\partial \hat{w}_{ak}} = e_a\hat{w}_c^T\varphi(k+1)v_c^T\hat{w}_m^T\psi_k v_m^T\rho\,\vartheta_a(\varsigma_{ak}) = \vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi,$ | (4.11) |
where $\varpi=v_c^T\hat{w}_m^T\psi_k v_m^T\rho$, and $\rho\in\mathbb{R}^{(n+m)\times m}$, $\varphi_{k+1}\in\mathbb{R}^{h_c\times h_c}$, $\psi_k\in\mathbb{R}^{h_m\times h_m}$ are given by $\rho=\begin{bmatrix}0_{n\times m}\\ I_{m\times m}\end{bmatrix}$, $\varphi_{k+1}=\frac{1}{2}\mathrm{diag}\{1-\vartheta_c^2(z_c(k+1),1),\cdots,1-\vartheta_c^2(z_c(k+1),h_c)\}$ and $\psi_k=\frac{1}{2}\mathrm{diag}\{1-\vartheta_m^2(z_{mk},1),\cdots,1-\vartheta_m^2(z_{mk},h_m)\}$, respectively. $\varpi$ remains a constant matrix after the model network is well trained.
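Putting the pieces of (4.10)–(4.11) together, the sketch below performs one action-network update; the weight shapes, the slicing of $v_c$ to its $\hat{x}(k+1)$ rows, and the learning rate are all assumptions made to keep the chain-rule dimensions consistent, and ϑ is reused from the model-network sketch.

```python
import numpy as np

def action_step(w_a, v_a, w_c, v_c, w_m, v_m, x_ki, u_next_hat, eta_a=0.05):
    """One gradient-descent step (4.10)-(4.11) for the action network.
    Assumed shapes: v_a n x h_a, w_a h_a x m, v_c (n+m) x h_c, w_c h_c x n,
    v_m (n+m) x h_m, w_m h_m x n. Target J_C = 0, so e_a is the critic output."""
    n, m = x_ki.size, w_a.shape[1]
    h_a = theta(v_a.T @ x_ki)                  # theta_a(varsigma_ak)
    u_k = w_a.T @ h_a                          # mu_hat(x(k_i)), Eq. (4.9)
    h_m = theta(v_m.T @ np.concatenate([x_ki, u_k]))
    x_next = w_m.T @ h_m                       # model prediction x_hat(k+1)
    h_c = theta(v_c.T @ np.concatenate([x_next, u_next_hat]))
    e_a = w_c.T @ h_c                          # lambda_hat(x(k+1)) - J_C
    phi = 0.5 * np.diag(1.0 - h_c**2)          # varphi(k+1), activation slope
    psi = 0.5 * np.diag(1.0 - h_m**2)          # psi_k
    rho = np.vstack([np.zeros((n, m)), np.eye(m)])
    varpi = v_c[:n, :].T @ w_m.T @ psi @ v_m.T @ rho   # h_c x m
    dE_du = (w_c.T @ phi @ varpi).T @ e_a      # dE_a/du(k), an m-vector
    return w_a - eta_a * np.outer(h_a, dE_du)  # Eq. (4.10)
```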
Remark 3. In the ADDHP algorithm, two action networks are constructed to approximate the control laws at the time instants k and k+1. The outputs of the second action network are used to approximate the costate function. The effect of the control input on the costate function is considered, then the ADDHP structure can learn more system information compared to HDP and DHP structure. Thus, the proposed approach has a higher approximate accuracy and a faster convergence rate.
Assumption 3. Assume that:
(1) The activation functions ϑ and the reconstruction error ζ are bounded, i.e., ‖ϑc‖≤ϑcM, ‖ϑa‖≤ϑaM, ‖ζck‖≤ζcM, where ϑcM, ϑaM, ζcM are positive constants.
(2) The optimal weight vectors w and v are bounded, i.e., ‖w‖<wM, ‖v‖<vM, where wM, vM are positive constants.
Since ξck, ξak and φ(k+1) depend only on the weight w and the activation function ϑ (ξck and ξak are defined in the subsequent proof), Assumption 3 guarantees that they are bounded. For simplicity, we let ξcM, ξaM and φM denote the upper bounds of ξck, ξak and φ(k+1), respectively.
Define the weight estimation errors of the critic and action networks as $\tilde{w}_c=\hat{w}_c-w_c$ and $\tilde{w}_a=\hat{w}_a-w_a$, respectively, where $\hat{w}$ denotes the estimated weight and $w$ the optimal weight.
Theorem 4.1. Suppose that Assumptions 1–3 hold and the triggering condition is determined by (3.1). Let the weight-updating laws of the NNs be regulated by (4.7) and (4.10), respectively. Then the system state x(k) and the weight estimation errors ˜wc and ˜wa are uniformly ultimately bounded (UUB) under the following conditions:
$\eta_c < \frac{1}{2\vartheta_{cM}^2},\qquad \eta_a < \frac{1}{2\vartheta_{aM}^2},\qquad \|x(k)\| > \sqrt{\frac{D_M^2}{\lambda_{\min}(Q)(\alpha-2)C^2}},$ | (4.12) |
where $D_M^2 = F_M^2 + (1+2\eta_c\vartheta_{cM}^2)\zeta_{cM}^2$.
Proof. Only the case in which an event is triggered at time k needs to be considered: when no event is triggered, the control law u(k) is not updated and the associated weight vectors wc and wa remain unchanged, so the Lyapunov function depends only on the system state and, by Theorem 3.1, stability is guaranteed for all k.
Define the Lyapunov function in the following form:
$V(k) = x^T(k)x(k) + \frac{1}{\eta_c}\mathrm{tr}\{\tilde{w}_c^T\tilde{w}_c\} + \frac{1}{\eta_a}\mathrm{tr}\{\tilde{w}_a^T\tilde{w}_a\}.$ | (4.13) |
Let $L_1=x^T(k)x(k)$, $L_2=\frac{1}{\eta_c}\mathrm{tr}\{\tilde{w}_c^T\tilde{w}_c\}$ and $L_3=\frac{1}{\eta_a}\mathrm{tr}\{\tilde{w}_a^T\tilde{w}_a\}$. The first-order difference of $L_1$ has been discussed in Theorem 3.1.
Based on the weight updating law, the weight estimation error of the critic network can be deduced as
$\tilde{w}_c(k+1) = \hat{w}_c(k+1) - w_c = \hat{w}_c(k) - \eta_c\frac{\partial E_c}{\partial \hat{w}_{ck}} - w_c = \tilde{w}_c(k) - \eta_c\vartheta_c(z_c(k+1))e_c.$ | (4.14) |
Then the first-order difference of L2 can be denoted as
$\Delta L_2 = \frac{1}{\eta_c}\mathrm{tr}\{\tilde{w}_c^T(k+1)\tilde{w}_c(k+1) - \tilde{w}_c^T(k)\tilde{w}_c(k)\} = \frac{1}{\eta_c}\mathrm{tr}\{[\tilde{w}_c(k)-\eta_c\vartheta_c(z_c(k+1))e_c]^T[\tilde{w}_c(k)-\eta_c\vartheta_c(z_c(k+1))e_c] - \tilde{w}_c^T(k)\tilde{w}_c(k)\} = \mathrm{tr}\{-2\tilde{w}_c^T(k)\vartheta_c(z_c(k+1))e_c + \eta_c e_c^T\vartheta_c^T(z_c(k+1))\vartheta_c(z_c(k+1))e_c\}.$ | (4.15) |
The error function of the critic network is $e_c=\tilde{w}_c^T\vartheta_c(z_c(k+1))-\zeta_{ck}$; let $\xi_{ck}=\tilde{w}_c^T\vartheta_c(z_c(k+1))$. Substituting these into the above formula and using the Cauchy-Schwarz inequality, Eq (4.15) can be further bounded as
$\Delta L_2 \le 2\eta_c\|\xi_{ck}\vartheta_c(z_c(k+1))\|^2 + 2\eta_c\|\zeta_{ck}\vartheta_c(z_c(k+1))\|^2 - 2\|\xi_{ck}\|^2 + \mathrm{tr}\{2\xi_{ck}\zeta_{ck}\} \le -(1-2\eta_c\|\vartheta_c(z_c(k+1))\|^2)\|\xi_{ck}\|^2 + (1+2\eta_c\|\vartheta_c(z_c(k+1))\|^2)\|\zeta_{ck}\|^2 \le -(1-2\eta_c\vartheta_{cM}^2)\xi_{cM}^2 + (1+2\eta_c\vartheta_{cM}^2)\zeta_{cM}^2.$ | (4.16) |
The weight estimation error of the action network can be described as
$\tilde{w}_a(k+1) = \tilde{w}_a(k) - \eta_a\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi.$ | (4.17) |
The first-order difference of L3 can be denoted as
$\Delta L_3 = \frac{1}{\eta_a}\mathrm{tr}\{\tilde{w}_a^T(k+1)\tilde{w}_a(k+1) - \tilde{w}_a^T(k)\tilde{w}_a(k)\} = \frac{1}{\eta_a}\mathrm{tr}\{\|\tilde{w}_a(k)-\eta_a\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\|^2 - \|\tilde{w}_a(k)\|^2\} \le \mathrm{tr}\{-2\tilde{w}_a(k)\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\} + \eta_a\|\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\|^2.$ | (4.18) |
Define $\xi_{ak}=\tilde{w}_a^T\vartheta_a(\varsigma_{ak})$, $\Xi_1=\mathrm{tr}\{-2\tilde{w}_a(k)\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\}$ and $\Xi_2=\eta_a\|\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\|^2$; then we can deduce that
$\Xi_1 = \|\hat{w}_c^T\vartheta_c(z_c(k+1)) - \hat{w}_c^T\varphi(k+1)\varpi\xi_{ak}\|^2 - \|\hat{w}_c^T\varphi(k+1)\varpi\xi_{ak}\|^2 - \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 \le \|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\xi_{ak}\|^2 + \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 \le \frac{1}{2}\|\xi_{ak}\|^4 + \frac{1}{2}\|\hat{w}_c^T\varphi(k+1)\varpi\|^4 + \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2,$ | (4.19) |
$\Xi_2 \le -\big[\|\hat{w}_c^T\varphi(k+1)\varpi\|^2 - \eta_a\|\vartheta_a(\varsigma_{ak})\hat{w}_c^T\varphi(k+1)\varpi\|^2\big]\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + \|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 \le -\big[1-\eta_a\|\vartheta_a(\varsigma_{ak})\|^2\big]\|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + \frac{1}{2}\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^4 + \frac{1}{2}\|\hat{w}_c^T\varphi(k+1)\varpi\|^4.$ | (4.20) |
Combining (4.19) and (4.20) with (4.18), ΔL3 satisfies
$\Delta L_3 \le -\big[1-\eta_a\|\vartheta_a(\varsigma_{ak})\|^2\big]\|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + F^2 \le -\big[1-\eta_a\vartheta_{aM}^2\big]w_{cM}^4\varphi_M^2\varpi_M^2\vartheta_{cM}^2 + F_M^2,$ | (4.21) |
where $F_M^2$ is defined by
$F^2 = \frac{1}{2}\|\xi_{ak}\|^4 + \|\hat{w}_c^T\varphi(k+1)\varpi\|^4 + \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + \frac{1}{2}\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^4 \le \frac{1}{2}\xi_{aM}^4 + w_{cM}^4\varphi_M^4\varpi_M^4 + w_{cM}^2\vartheta_{cM}^2 + \frac{1}{2}w_{cM}^4\vartheta_{cM}^4 = F_M^2.$ | (4.22) |
Based on (3.9), (4.16) and (4.21), we can conclude that
$\Delta L \le -\lambda_{\min}(Q)(\alpha-2)C^2\|x(k)\|^2 - (1-2\eta_c\vartheta_{cM}^2)\xi_{cM}^2 - \big[1-\eta_a\vartheta_{aM}^2\big]w_{cM}^4\varphi_M^2\varpi_M^2\vartheta_{cM}^2 + D_M^2.$ | (4.23) |
According to (4.12), the first-order difference of V is negative, which completes the proof.
Remark 4. In this section, it has been proved that the system states and the neural network weight estimation errors are uniformly ultimately bounded (UUB). This implies that the cost function and control law converge to neighborhoods of their optimal values. The convergence of the system is thus demonstrated theoretically, which supports the effectiveness of the proposed method.
Example 1. Consider the following mass-spring-damper system [23]:
$\begin{cases}\dot{x}_1 = x_2,\\ \dot{x}_2 = -\dfrac{b}{m}x_2 - \dfrac{k}{m}x_1 + \dfrac{F}{m},\end{cases}$ | (5.1) |
where m = 1 kg is the mass of the body, b = 3 N·s/m is the damping coefficient and k = 9 N/m is the linear spring constant. The control input u(k) of the system is the external force F. Choose the initial state vector x(0)=[−0.5,0.5]T and the constants α=2.5 and C=0.3. Based on the Euler method, the system can be discretized as
$\begin{cases}x_1(k+1) = 0.9996x_1(k) + 0.0099x_2(k),\\ x_2(k+1) = -0.0887x_1(k) + 0.97x_2(k) + 0.0099u(k).\end{cases}$ | (5.2) |
Set the control constraint as |u|<0.1 (u is one-dimensional, so the per-channel constraint |uj|<0.1 reduces to this). Let Q=I2 and R=I, where I2 and I denote identity matrices of appropriate dimensions. Three-layer neural networks are chosen to implement the algorithm. For the model network, 500 data samples are used for training and another 500 to test its performance. Then the critic and action networks are trained for 500 iterations to ensure that the given accuracy ε=10−5 is reached. In the training process, the learning rates are ηm=ηc=ηa=0.05.
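To illustrate the event-triggered loop on this example, the sketch below simulates the discretized plant (5.2) under condition (3.1), reusing `should_trigger` from the Section 3 sketch; the stabilizing feedback gain stands in for the trained action network and is purely hypothetical.

```python
import numpy as np

A = np.array([[0.9996, 0.0099],
              [-0.0887, 0.97]])
B = np.array([0.0, 0.0099])

def mu(x_ki):
    # placeholder for the trained action network; the gain is hypothetical,
    # and the output is clipped to the constraint |u| < 0.1
    return float(np.clip(-np.array([0.5, 1.0]) @ x_ki, -0.1, 0.1))

x = np.array([-0.5, 0.5])                     # x(0) as chosen above
x_ki, u, updates = x.copy(), mu(x), 1
for k in range(500):
    if should_trigger(x, x_ki, alpha=2.5, C=0.3):
        x_ki, u, updates = x.copy(), mu(x), updates + 1   # resample, refresh ZOH
    x = A @ x + B * u                         # plant step, Eq. (5.2)
print(f"{updates} controller updates over 500 steps, final ||x|| = {np.linalg.norm(x):.4f}")
```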
Moreover, in order to compare with the event-triggered HDP algorithm proposed in [23], we also present the controller designed by the event-based HDP algorithm. Then, we apply the optimal control laws designed by the event-based ADDHP and HDP techniques to the system for 500 time steps, respectively. The state curves obtained with these two methods are shown in Figures 2 and 3. It is evident that the proposed method converges faster and performs better than the event-triggered HDP algorithm. The corresponding control curves are shown in Figure 4. The control law is updated only when the triggering condition is violated. As displayed in Figure 4, the controller derived by the event-based ADDHP algorithm reduces the number of controller updates and converges faster while ensuring system performance.
Example 2. Consider the discrete-time nonlinear system:
$\begin{cases}x_1(k+1) = x_1(k) + 0.1x_2(k),\\ x_2(k+1) = -0.17\sin(x_1(k)) + 0.98x_2(k) + 0.1u_1(k),\\ x_3(k+1) = 0.1x_1(k) + 0.2x_2(k) + x_3(k)\cos(u_2(k)),\end{cases}$ | (5.3) |
where the state vector is x(k)=[x1(k),x2(k),x3(k)]T and the control input is u(k)=[u1(k),u2(k)]T. The weight matrices of the utility function are set as Q=I3 and R=0.01I2. The constraint boundary is set to 3.
The learning rates and other relevant parameters of the model network are chosen the same as in Example 1, but with the structure 5-8-3. We apply the developed algorithm to train the critic network (5-8-3) and the action network (3-8-2). The initial weights of these two networks are selected in the same way as in Example 1. Here, the initial state vector is chosen as x(0)=[0.5,0.5,0.5]T. For the event-based mechanism, we set the threshold parameters as α=3 and C=0.2. The state trajectories of the developed method and the event-triggered HDP algorithm are shown in Figures 5–7, and the corresponding control curves in Figures 8 and 9. Remarkably, an evident improvement in resource utilization is obtained under the event-driven formulation. From these results, we observe that the system performance is maintained while the control efficiency is markedly enhanced, which demonstrates the effectiveness of the event-driven ADDHP approach. Moreover, the convergence rate is faster than that of the event-driven HDP algorithm.
In this paper, a novel event-triggered control method has been studied for discrete-time nonlinear systems with constrained inputs. A novel triggering condition has been designed with a simpler form and fewer assumptions. Moreover, it has been proved that the system states and the neural network weight estimation errors are uniformly ultimately bounded. The simulation examples show that the proposed method can reduce the computational burden while ensuring system performance. However, due to the complexity of actual systems, full state feedback may be infeasible; therefore, other feedback control methods will be studied in the future.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors declare that they have no conflicts of interest.
[1] D. Liu, S. Xue, B. Zhao, B. Luo, Q. Wei, Adaptive dynamic programming for control: a survey and recent advances, IEEE T. Syst. Man Cy., 51 (2021), 142–160. https://doi.org/10.1109/TSMC.2020.3042876
[2] Y. Zhang, B. Zhao, D. Liu, Deterministic policy gradient adaptive dynamic programming for model-free optimal control, Neurocomputing, 387 (2020), 40–50. https://doi.org/10.1016/j.neucom.2019.11.032
[3] M. Ha, D. Wang, D. Liu, A novel value iteration scheme with adjustable convergence rate, IEEE T. Neur. Net. Lear., in press. https://doi.org/10.1109/TNNLS.2022.3143527
[4] C. Mu, D. Wang, H. He, Novel iterative neural dynamic programming for data-based approximate optimal control design, Automatica, 81 (2017), 240–252. https://doi.org/10.1016/j.automatica.2017.03.022
[5] L. Dong, X. Zhong, C. Sun, H. He, Adaptive event-triggered control based on heuristic dynamic programming for nonlinear discrete-time systems, IEEE T. Neur. Net. Lear., 28 (2017), 1594–1605. https://doi.org/10.1109/TNNLS.2016.2541020
[6] T. Li, D. Yang, X. Xie, H. Zhang, Event-triggered control of nonlinear discrete-time system with unknown dynamics based on HDP(λ), IEEE T. Cybernetics, 52 (2021), 6046–6058. https://doi.org/10.1109/TCYB.2020.3044595
[7] J. Lu, Q. Wei, T. Zhou, Z. Wang, F. Wang, Event-triggered near-optimal control for unknown discrete-time nonlinear systems using parallel control, IEEE T. Cybernetics, 53 (2023), 1890–1904. https://doi.org/10.1109/TCYB.2022.3164977
[8] J. Wang, Y. Wang, Z. Ji, Model-free event-triggered optimal control with performance guarantees via goal representation heuristic dynamic programming, Nonlinear Dyn., 108 (2022), 3711–3726. https://doi.org/10.1007/s11071-022-07438-y
[9] Z. Wang, J. Lee, X. Sun, Y. Chai, Y. Liu, Self-learning optimal control with performance analysis using event-triggered adaptive dynamic programming, Proceedings of the 5th International Conference on Crowd Science and Engineering, 2021, 29–34. https://doi.org/10.1145/3503181.3503187
[10] S. Xue, B. Luo, D. Liu, Y. Gao, Event-triggered ADP for tracking control of partially unknown constrained uncertain systems, IEEE T. Cybernetics, 52 (2022), 9001–9012. https://doi.org/10.1109/TCYB.2021.3054626
[11] D. Wang, M. Zhao, M. Ha, J. Ren, Neural optimal tracking control of constrained nonaffine systems with a wastewater treatment application, Neural Networks, 143 (2021), 121–132. https://doi.org/10.1016/j.neunet.2021.05.027
[12] J. Lu, Q. Wei, Y. Liu, T. Zhou, F. Wang, Event-triggered optimal parallel tracking control for discrete-time nonlinear systems, IEEE T. Syst. Man Cy., 52 (2022), 3772–3784. https://doi.org/10.1109/TSMC.2021.3073429
[13] K. Wang, Q. Gu, B. Huang, Q. Wei, T. Zhou, Adaptive event-triggered near-optimal tracking control for unknown continuous-time nonlinear systems, IEEE Access, 10 (2022), 9506–9518. https://doi.org/10.1109/ACCESS.2021.3140076
[14] Q. Wei, J. Lu, T. Zhou, X. Cheng, F. Wang, Event-triggered near-optimal control of discrete-time constrained nonlinear systems with application to a boiler-turbine system, IEEE T. Ind. Inform., 18 (2022), 3926–3935. https://doi.org/10.1109/TII.2021.3116084
[15] D. Wang, L. Hu, M. Zhao, J. Qiao, Adaptive critic for event-triggered unknown nonlinear optimal tracking design with wastewater treatment applications, IEEE T. Neur. Net. Lear., in press. https://doi.org/10.1109/TNNLS.2021.3135405
[16] B. Sun, E. van Kampen, Event-triggered constrained control using explainable global dual heuristic programming for nonlinear discrete-time systems, Neurocomputing, 468 (2022), 452–463. https://doi.org/10.1016/j.neucom.2021.10.046
[17] S. Xue, B. Luo, D. Liu, Y. Li, Adaptive dynamic programming based event-triggered control for unknown continuous-time nonlinear systems with input constraints, Neurocomputing, 396 (2020), 191–200. https://doi.org/10.1016/j.neucom.2018.09.097
[18] S. Zhang, B. Zhao, Y. Zhang, Event-triggered control for input constrained non-affine nonlinear systems based on neuro-dynamic programming, Neurocomputing, 440 (2021), 175–184. https://doi.org/10.1016/j.neucom.2021.01.116
[19] X. Yang, Q. Wei, Adaptive critic learning for constrained optimal event-triggered control with discounted cost, IEEE T. Neur. Net. Lear., 32 (2021), 91–104. https://doi.org/10.1109/TNNLS.2020.2976787
[20] M. Ha, D. Wang, D. Liu, Event-triggered adaptive critic control design for discrete-time constrained nonlinear systems, IEEE T. Syst. Man Cy., 50 (2020), 3158–3168. https://doi.org/10.1109/TSMC.2018.2868510
[21] M. Ha, D. Wang, D. Liu, B. Zhao, Adaptive event-based control for discrete-time nonaffine systems with constrained inputs, Proceedings of the Eighth International Conference on Information Science and Technology (ICIST), 2018, 104–109. https://doi.org/10.1109/ICIST.2018.8426093
[22] B. Luo, Y. Yang, D. Liu, H. Wu, Event-triggered optimal control with performance guarantees using adaptive dynamic programming, IEEE T. Neur. Net. Lear., 31 (2020), 76–88. https://doi.org/10.1109/TNNLS.2019.2899594
[23] Z. Wang, Q. Wei, D. Liu, A novel triggering condition of event-triggered control based on heuristic dynamic programming for discrete-time systems, Optim. Contr. Appl. Meth., 39 (2018), 1467–1478. https://doi.org/10.1002/oca.2421
[24] C. Mu, K. Liao, K. Wang, Event-triggered design for discrete-time nonlinear systems with control constraints, Nonlinear Dyn., 103 (2021), 2645–2657. https://doi.org/10.1007/s11071-021-06218-4
[25] M. Ha, D. Wang, D. Liu, Event-triggered constrained control with DHP implementation for nonaffine discrete-time systems, Inform. Sciences, 519 (2020), 110–123. https://doi.org/10.1016/j.ins.2020.01.020