
In this paper, a novel event-triggered optimal control method is developed for nonlinear discrete-time systems with constrained inputs. First, a non-quadratic utility function is constructed to overcome the challenge posed by saturating actuators. Second, a novel triggering condition is designed to reduce the computational burden; unlike other triggering conditions, it requires fewer assumptions to guarantee asymptotic stability. Then, the optimal cost function and control law are obtained by constructing the action and critic networks. Convergence analysis is provided, taking into account both the system state and the neural network weight estimation errors. Finally, the effectiveness and correctness of the proposed method are verified by two numerical examples.
Citation: Yuanyuan Cheng, Yuan Li. A novel event-triggered constrained control for nonlinear discrete-time systems[J]. AIMS Mathematics, 2023, 8(9): 20530-20545. doi: 10.3934/math.20231046
Optimality is one of the most significant properties of a control system. Generally, the Hamilton-Jacobi-Bellman (HJB) equation provides the framework for solving the optimal control problem. Nevertheless, obtaining its analytical solution is formidable. Therefore, adaptive dynamic programming (ADP) methods have been widely used to approximate its numerical solutions [1,2,3,4]. As research has deepened, ADP has shown great development potential.
However, with growing resource consumption and energy depletion, reducing energy loss has become a focus of industrial development. The event-triggered technique can greatly reduce the transmission and updating of information. As an advanced sampling method, the essence of the event-triggered mechanism is to decide when the controller updates by choosing an appropriate triggering condition, thereby saving energy [5,6,7]. Wang et al. designed a novel adaptive event-triggering condition, solving the event-triggered control problem for discrete-time nonlinear systems [8]. Wei et al. studied the event-based self-learning optimal regulation problem of discrete-time nonlinear systems and proved that a suitable triggering condition can ensure the stability of the system [9]. Event-triggered control is also widely applied in tracking control problems [10,11,12,13] and other fields [14]. Hu et al. developed an event-based approximate optimal tracking control method for discrete-time nonlinear systems [15]. Luo et al. introduced a novel event-triggered control policy and gave a detailed Lyapunov analysis for continuous-time systems [16].
Moreover, owing to ubiquitous physical constraints, practical systems are inevitably subject to saturation nonlinearities. Control constraints can easily degrade the overall performance of the system and make controller design more difficult than in the unconstrained case. Therefore, there is great interest in studying various systems with control constraints [17,18,19]. Ha et al. solved the constrained control problem by minimizing a novel nonquadratic cost function [20]. Ha et al. investigated an event-based controller for the near-optimal control of discrete-time systems with constrained inputs [21]. For the asymmetric input constraint problem, Sun et al. developed an event-triggered optimal control method [22]. Liu et al. designed a novel triggering condition with a simple form and few assumptions, solving the optimal control problem by using the heuristic dynamic programming (HDP) algorithm [23]. Considering the constrained-input problem, Liao et al. proposed an event-triggered dual heuristic dynamic programming (DHP) algorithm [24]. Mu et al. applied the global dual heuristic dynamic programming (GDHP) algorithm to solve the event-triggered constrained control of nonlinear discrete-time systems [25]. Compared with the HDP and DHP structures, the action-dependent dual heuristic programming (ADDHP) structure learns more system information, which enables it to obtain better control performance. This has motivated our study.
Given that the ADDHP algorithm has these advantages, we investigate a novel event-triggered control method based on it. The main contributions of this paper are listed as follows:
(1) A novel triggering condition is designed, which can effectively reduce the number of events occurring. Additionally, under this triggering condition, the stability of the system is proved with fewer assumptions. Hence, the novel event-based ADDHP algorithm is more practical for application.
(2) The convergence for the cost function and control inputs is proved theoretically.
(3) In the action-critic network, the influence of the control input on the cost function is considered. Thus, this method has a faster convergence rate and a higher approximate accuracy.
This paper is arranged as follows: Section 2 states the event-triggered constrained control problem. A novel triggering condition and the stability analysis of the system are provided in Section 3. Section 4 briefly introduces the implementation of the ADDHP algorithm and analyzes the convergence of the system states and neural network weights. In Section 5, two simulation examples are presented to verify the correctness of the proposed algorithm. Finally, some conclusions and the prospects for the future are given in Section 6.
Consider the following nonlinear discrete-time system with constrained inputs:
x(k+1)=F(x(k),u(k)),k=0,1,2,⋯, | (2.1) |
where x(k)∈Rn is the state vector, u(k)∈Rm is the control input, F(⋅,⋅) is an unknown system function. Ωu={u(k)|u(k)=[u1(k),u2(k),⋯,um(k)]T∈Rm,|uj(k)|≤ˉuj,j=1,2,⋯,m}, where ˉuj is the saturation level of the jth actuator. The origin x(k)=0 is the unique equilibrium point of the system (2.1) under u(k)=0, i.e., F(0,0)=0.
Assumption 1. [23] System (2.1) is controllable and observable, and the unknown system function F:Rn×Rm→Rn is Lipschitz continuous.
Assumption 1 implies that there exists a continuous state feedback control policy u(k)=μ(x(k)),μ:Rn→Rm that can stabilize system (2.1) to the equilibrium point.
In event-triggered control, we define a monotonically increasing time sequence {ki}∞i=0 as the sampling sequence. Between triggering instants, the control input is held constant over the interval [ki,ki+1) by a zero-order hold (ZOH). Therefore, the feedback control law can be expressed as
u(x(k))=μ(x(ki)). | (2.2) |
Since there is a gap between the sampled state x(ki) and the current state x(k), the triggering error is defined as
e(k)=x(ki)−x(k). | (2.3) |
Only when e(k)=0, i.e., x(k)=x(ki), i=0,1,2,⋯, is the current state marked as a sampling state and transmitted to the controller to update the control law. The control law can be rewritten as u(x(k))=μ(x(k)+e(k)), so system (2.1) becomes
x(k+1)=F(x(k),μ(x(k)+e(k))). | (2.4) |
The utility function is described as
$U(x(k),\mu(x(k_i))) = x^T(k)Qx(k) + T(\mu(x(k_i))) = x^T(k)Qx(k) + 2\int_{0}^{\mu(x(k_i))}\tanh^{-T}\big(\bar{U}^{-1}v\big)\,\bar{U}R\,\mathrm{d}v,$ | (2.5) |
where $Q\in\mathbb{R}^{n\times n}$ and $R$ are symmetric positive definite matrices, and $T(\mu(x(k_i)))$ is a positive non-quadratic function that ensures the control input $\mu(x(k_i))$ does not exceed the constraint boundary. $\bar{U}\in\mathbb{R}^{m\times m}$ is a constant diagonal matrix given by $\bar{U}=\mathrm{diag}\{\bar{u}_1,\bar{u}_2,\cdots,\bar{u}_m\}$.
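To make the utility concrete, the following sketch evaluates $U(x(k),\mu(x(k_i)))$ numerically for the diagonal-$R$ case, in which the line integral in (2.5) decomposes channel by channel; the function name and the example values are illustrative assumptions, not part of the original algorithm.

```python
import numpy as np
from scipy.integrate import quad

def utility(x, u, Q, R_diag, u_bar):
    """Non-quadratic utility U(x, u) of Eq. (2.5), sketched for diagonal R:
    T(u) = 2 * sum_j R_j * ubar_j * integral_0^{u_j} artanh(v / ubar_j) dv."""
    quadratic_term = x @ Q @ x
    T = 0.0
    for uj, rj, ubj in zip(np.atleast_1d(u), np.atleast_1d(R_diag), np.atleast_1d(u_bar)):
        integral, _ = quad(lambda v: np.arctanh(v / ubj), 0.0, uj)
        T += 2.0 * rj * ubj * integral
    return quadratic_term + T

# The integrand grows without bound as u_j approaches the saturation level
# ubar_j, which is what penalizes control actions near the constraint boundary.
x = np.array([-0.5, 0.5])
print(utility(x, u=0.09, Q=np.eye(2), R_diag=1.0, u_bar=0.1))
```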
The purpose of optimal control is to search for an optimal control strategy μ∗(x(ki)) to minimize the cost function:
$J(x(k)) = \sum_{i=k}^{\infty} U\big(x(i),\mu(x(k_i))\big).$ | (2.6) |
For the cost function J(x(k)), its Hamiltonian function is expressed as
H(x,μ,∇J)=U(x(i),μ(x(ki)))+∇JT(x)F(x,u), | (2.7) |
where ∇J(⋅)=∂J(⋅)/∂x(⋅). According to Bellman's optimality principle, the optimal cost function J∗(x(ki)) can be gained by solving the following HJB equation:
minμ∈ΩuH(x,μ,∇J∗)=0, | (2.8) |
where ∇J∗(0)=0, the optimal control law can be expressed as
μ∗(x(ki))=argminμ∈ΩuH(x,μ,∇J∗). | (2.9) |
In the following section, we will prove that the system is asymptotically stable under the designed triggering condition.
Design the triggering condition in the following form:
$\|e(k)\| \le e_T = \sqrt{\dfrac{1-\alpha C^2}{2C^2}}\,\|x(k_i)\|,$ | (3.1) |
where $C\in(0,1/\sqrt{\alpha})$ and $\alpha\in(2,1/C^2)$ are positive design constants. The triggering threshold $e_T$ is not unique; it depends on the sampled state $x(k_i)$ and the designed constants $\alpha$ and $C$. Then the next triggering instant can be obtained by
ki+1=inf{k|‖e(k)‖>eT,k>ki}. | (3.2) |
For discrete-time systems, the minimal inter-sample time is bounded below by a nonzero positive constant (one sampling period), so Zeno behavior is excluded.
Remark 1. The threshold has a similar form to that proposed in [23]. This paper introduces the parameter α that interacts with C. By adjusting these two parameters, the novel triggering condition can achieve higher resource utilization efficiency. It will be shown in the simulation example later. Compared with [24] and [25], the triggering condition designed in this paper is easy to implement and requires fewer assumptions.
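As a minimal sketch (using the parameter values α = 2.5 and C = 0.3 from Example 1 as assumed defaults), the trigger test of (3.1) and (3.2) amounts to a single norm comparison at every step:

```python
import numpy as np

def should_trigger(x_k, x_ki, alpha=2.5, C=0.3):
    """Return True when the triggering error ||e(k)|| = ||x(k_i) - x(k)||
    exceeds the state-dependent threshold e_T of Eq. (3.1).
    Requires C in (0, 1/sqrt(alpha)), i.e. alpha in (2, 1/C^2)."""
    e_norm = np.linalg.norm(x_ki - x_k)
    e_T = np.sqrt((1.0 - alpha * C**2) / (2.0 * C**2)) * np.linalg.norm(x_ki)
    return e_norm > e_T
```

When the test fires, the controller resamples $x(k_i)\leftarrow x(k)$, so $e(k)$ returns to zero and the held control input is refreshed.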
Definition 1. [23] If there exist $\mathcal{K}_\infty$ functions $\alpha_1,\alpha_2,\alpha_3$ and a $\mathcal{K}$ function $\beta$ such that the following inequalities hold:
α1(‖x(k)‖)≤V(x(k))≤α2(‖x(k)‖), | (3.3) |
V(F(x(k),μ(x(k)+e(k))))−V(x(k))≤−α3(‖x(k)‖)+β(‖e(k)‖), | (3.4) |
then the function V:Rn→R is called an input-to-state stability (ISS) Lyapunov function.
Assumption 2. [25] There exists a positive constant $C\in(0,1/\sqrt{\alpha})$ such that the following inequality holds:
$\|F(x(k),\mu(x(k)+e(k)))\| \le C\|x(k)\| + C\|e(k)\|.$ | (3.5) |
Theorem 3.1. Suppose that Assumptions 1 and 2 hold and the triggering condition is determined by (3.1), then the nonaffine system (2.4) is asymptotically stable.
Proof. Define the following Lyapunov function:
$V(x(k)) = x^T(k)Qx(k) + T(\mu(x(k_i))).$ | (3.6) |
For k∈[ki,ki+1), the control law stored in the ZOH drives the system and the term T(μ(x(ki))) remains constant, so the difference of the Lyapunov function is only related to the system state.
The first-order difference of V is
$\Delta V(x(k+1)) = x^T(k+1)Qx(k+1) - x^T(k)Qx(k) = \lambda_{\min}(Q)\big[\|x(k+1)\|^2 - \|x(k)\|^2\big].$ | (3.7) |
Define $\alpha_3(\|x(k)\|)=\lambda_{\min}(Q)(1-2C^2)\|x(k)\|^2$ and $\beta(\|e(k)\|)=2\lambda_{\min}(Q)C^2\|e(k)\|^2$. According to Assumption 2 and the Cauchy-Schwarz inequality, we can deduce that
$\Delta V(x(k+1)) \le \lambda_{\min}(Q)\big[(C\|x(k)\|+C\|e(k)\|)^2 - \|x(k)\|^2\big] \le \lambda_{\min}(Q)\big[(2C^2-1)\|x(k)\|^2 + 2C^2\|e(k)\|^2\big] = -\alpha_3(\|x(k)\|) + \beta(\|e(k)\|).$ | (3.8) |
According to Definition 1, V is an ISS Lyapunov function. Substituting the triggering condition into (3.8), we obtain
$\Delta V(x(k+1)) \le \lambda_{\min}(Q)\big[(2C^2-1)\|x(k)\|^2 + (1-\alpha C^2)\|x(k)\|^2\big] = \lambda_{\min}(Q)(2-\alpha)C^2\|x(k)\|^2.$ | (3.9) |
Since $\alpha\in(2,1/C^2)$, we have $\Delta V<0$ for all $x(k)\neq 0$. Therefore, the event-based system (2.4) is asymptotically stable.
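As a concrete check with the parameter values used later in Example 1 ($\alpha=2.5$, $C=0.3$), both design intervals are nonempty and the bound in (3.9) is strictly negative:
$C=0.3\in(0,1/\sqrt{2.5})\approx(0,0.632)$ and $\alpha=2.5\in(2,1/0.3^2)\approx(2,11.1)$, so $1-\alpha C^2=0.775>0$ gives $e_T\approx 2.075\,\|x(k_i)\|$, while $\Delta V(x(k+1))\le\lambda_{\min}(Q)(2-2.5)(0.3)^2\|x(k)\|^2=-0.045\,\lambda_{\min}(Q)\|x(k)\|^2$.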
Remark 2. Compared with [24] and [25], this paper establishes stability with fewer conditions. When the triggering condition is violated at the time instant k+1, the system works under the updated control law, which is equivalent to time-triggered control at k+1. According to optimal control theory, stability is guaranteed at this single instant.
Utilizing the advantages of neural networks, three networks are established to approximate the system dynamics, the costate function and the control law, respectively. Moreover, the event-triggered technique is introduced to reduce the communication burden. A schematic diagram of the event-triggered optimal control (ETOC) scheme is illustrated in Figure 1.
For notational simplicity, we define some notation before presenting the main results. The weight matrix from the input to the hidden layer is denoted by w, and the weight matrix from the hidden to the output layer is denoted by v. The activation function is set as ϑ(t)=(1−e−t)/(1+e−t); ζ and η represent the approximation error and the learning rate, respectively.
The model network is employed to identify the system dynamics x(k+1). Then x(k+1) can be represented as
$x(k+1) = w_m^T\vartheta_m(\sigma_{mk}) + \zeta_{mk},$ | (4.1) |
where $\sigma_{mk}=v_m^T\theta_k$ and $\theta_k=[x^T(k),u^T(k)]^T$ is the input vector. Since the optimal weight vector $w_m$ is usually unknown, we approximate it with $\hat{w}_m$; then the system state can be estimated as
$\hat{x}(k+1) = \hat{w}_m^T\vartheta_m(\sigma_{mk}).$ | (4.2) |
The error function of the model network is $e_m=\hat{x}(k+1)-x(k+1)$, and the objective performance function $E_m$ is defined as
$E_m = \frac{1}{2}e_m^Te_m.$ | (4.3) |
We apply the gradient descent algorithm to update ˆwm:
$\hat{w}_m(k+1) = \hat{w}_{mk} - \eta_m\frac{\partial E_m}{\partial \hat{w}_{mk}},$ | (4.4) |
$\frac{\partial E_m}{\partial \hat{w}_{mk}} = \frac{\partial E_m}{\partial e_m}\frac{\partial e_m}{\partial \hat{x}(k+1)}\frac{\partial \hat{x}(k+1)}{\partial \hat{w}_{mk}} = e_m\vartheta_m(\sigma_{mk}).$ | (4.5) |
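The following sketch implements one model-network update (4.4)–(4.5); the weight shapes and the learning rate η_m = 0.05 (taken from Section 5) are assumptions, and ϑ is the activation defined above.

```python
import numpy as np

def theta(t):
    # activation of Section 4: (1 - e^{-t}) / (1 + e^{-t}), i.e. tanh(t/2)
    return (1.0 - np.exp(-t)) / (1.0 + np.exp(-t))

def model_step(w_m, v_m, x_k, u_k, x_next, eta_m=0.05):
    """One gradient-descent step (4.4)-(4.5) for the model network.
    Assumed shapes: v_m is (n+m) x h_m, w_m is h_m x n."""
    theta_k = np.concatenate([x_k, u_k])     # theta_k = [x^T(k), u^T(k)]^T
    h = theta(v_m.T @ theta_k)               # hidden output theta_m(sigma_mk)
    x_hat = w_m.T @ h                        # Eq. (4.2)
    e_m = x_hat - x_next                     # identification error
    w_m = w_m - eta_m * np.outer(h, e_m)     # matrix form of e_m * theta_m
    return w_m, 0.5 * e_m @ e_m              # updated weights and loss E_m
```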
The critic network is used to approximate the costate function, which can be described as
$\hat{\lambda}^{(i+1)}(x(k+1)) = \hat{w}_c^T\vartheta_c(z_c(k+1)),$ | (4.6) |
where $z_c(k+1)=v_c^T\pi_{k+1}$ and $\pi_{k+1}=[\hat{x}^T(k+1),\hat{u}^T(k+1)]^T$ is the input vector; $\hat{\lambda}(x(k+1))=\partial\hat{J}(x(k+1))/\partial x(k+1)$ is the estimate of $\lambda(x(k+1))$.
We define the error function of the critic network as $e_c=\hat{\lambda}^{(i+1)}(x(k+1))-\lambda^{(i+1)}(x(k+1))$. The critic network is trained to minimize the performance measure $E_c=\frac{1}{2}e_c^Te_c$.
The weight tuning law is designed to obey a gradient-descent algorithm:
$\hat{w}_c(k+1) = \hat{w}_{ck} - \eta_c\frac{\partial E_c}{\partial \hat{w}_{ck}},$ | (4.7) |
$\frac{\partial E_c}{\partial \hat{w}_{ck}} = \frac{\partial E_c}{\partial e_c}\frac{\partial e_c}{\partial \hat{\lambda}(x(k+1))}\frac{\partial \hat{\lambda}(x(k+1))}{\partial \hat{w}_{ck}} = e_c\vartheta_c(z_c(k+1)).$ | (4.8) |
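A matching sketch of the critic update (4.7)–(4.8), reusing ϑ from the model-network sketch; the costate target λ^{(i+1)} comes from the ADDHP iteration and is treated here as a given input, and the shapes are assumptions.

```python
import numpy as np

def critic_step(w_c, v_c, x_next_hat, u_next_hat, lam_target, eta_c=0.05):
    """One gradient-descent step (4.7)-(4.8) for the critic network.
    Assumed shapes: v_c is (n+m) x h_c, w_c is h_c x n."""
    pi = np.concatenate([x_next_hat, u_next_hat])  # pi_{k+1}
    h = theta(v_c.T @ pi)                          # theta_c(z_c(k+1))
    lam_hat = w_c.T @ h                            # Eq. (4.6)
    e_c = lam_hat - lam_target                     # critic error
    return w_c - eta_c * np.outer(h, e_c)          # matrix form of e_c * theta_c
```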
The input of the action network is the sampling state x(ki), which is used to obtain the control law μ(x(ki)). Then μ(x(ki)) can be estimated as
$\hat{\mu}(x(k_i)) = \hat{w}_a^T\vartheta_a(\varsigma_{ak}),$ | (4.9) |
where $\varsigma_{ak}=v_a^Tx(k_i)$. Define the error function as $e_a=\hat{\lambda}^{(i+1)}(x(k+1))-J_C$, where $J_C$ denotes the desired ultimate objective and is generally set to zero. Thus, the target performance measure can be designed as $E_a=\frac{1}{2}e_a^Te_a$.
According to the gradient-descent algorithm, the weights can be updated as
$\hat{w}_a(k+1) = \hat{w}_{ak} - \eta_a\frac{\partial E_a}{\partial \hat{w}_{ak}},$ | (4.10) |
$\frac{\partial E_a}{\partial \hat{w}_{ak}} = e_a\hat{w}_c^T\varphi(k+1)v_c^T\hat{w}_m^T\psi_k v_m^T\rho\,\vartheta_a(\varsigma_{ak}) = \vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi,$ | (4.11) |
where $\varpi=v_c^T\hat{w}_m^T\psi_k v_m^T\rho$, and $\rho\in\mathbb{R}^{(n+m)\times m}$, $\varphi_{k+1}\in\mathbb{R}^{h_c\times h_c}$, $\psi_k\in\mathbb{R}^{h_m\times h_m}$ are given by $\rho=\begin{bmatrix}0_{n\times m}\\ I_{m\times m}\end{bmatrix}$, $\varphi_{k+1}=\frac{1}{2}\mathrm{diag}\{1-\vartheta_c^2(z_c(k+1),1),\cdots,1-\vartheta_c^2(z_c(k+1),h_c)\}$ and $\psi_k=\frac{1}{2}\mathrm{diag}\{1-\vartheta_m^2(z_{mk},1),\cdots,1-\vartheta_m^2(z_{mk},h_m)\}$, respectively. $\varpi$ remains a constant matrix after the model network is well trained.
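Putting the pieces of (4.10)–(4.11) together, the sketch below performs one action-network update; the weight shapes, the slicing of $v_c$ to its $\hat{x}(k+1)$ rows, and the learning rate are all assumptions made to keep the chain-rule dimensions consistent, and ϑ is reused from the model-network sketch.

```python
import numpy as np

def action_step(w_a, v_a, w_c, v_c, w_m, v_m, x_ki, u_next_hat, eta_a=0.05):
    """One gradient-descent step (4.10)-(4.11) for the action network.
    Assumed shapes: v_a n x h_a, w_a h_a x m, v_c (n+m) x h_c, w_c h_c x n,
    v_m (n+m) x h_m, w_m h_m x n. Target J_C = 0, so e_a is the critic output."""
    n, m = x_ki.size, w_a.shape[1]
    h_a = theta(v_a.T @ x_ki)                  # theta_a(varsigma_ak)
    u_k = w_a.T @ h_a                          # mu_hat(x(k_i)), Eq. (4.9)
    h_m = theta(v_m.T @ np.concatenate([x_ki, u_k]))
    x_next = w_m.T @ h_m                       # model prediction x_hat(k+1)
    h_c = theta(v_c.T @ np.concatenate([x_next, u_next_hat]))
    e_a = w_c.T @ h_c                          # lambda_hat(x(k+1)) - J_C
    phi = 0.5 * np.diag(1.0 - h_c**2)          # varphi(k+1), activation slope
    psi = 0.5 * np.diag(1.0 - h_m**2)          # psi_k
    rho = np.vstack([np.zeros((n, m)), np.eye(m)])
    varpi = v_c[:n, :].T @ w_m.T @ psi @ v_m.T @ rho   # h_c x m
    dE_du = (w_c.T @ phi @ varpi).T @ e_a      # dE_a/du(k), an m-vector
    return w_a - eta_a * np.outer(h_a, dE_du)  # Eq. (4.10)
```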
Remark 3. In the ADDHP algorithm, two action networks are constructed to approximate the control laws at the time instants k and k+1. The outputs of the second action network are used to approximate the costate function. The effect of the control input on the costate function is considered, then the ADDHP structure can learn more system information compared to HDP and DHP structure. Thus, the proposed approach has a higher approximate accuracy and a faster convergence rate.
Assumption 3. Assume that:
(1) The activation functions ϑ and the reconstruction error ζ are bounded, i.e., ‖ϑc‖≤ϑcM, ‖ϑa‖≤ϑaM, ‖ζck‖≤ζcM, where ϑcM, ϑaM, ζcM are positive constants.
(2) The optimal weight vectors w and v are bounded, i.e., ‖w‖<wM, ‖v‖<vM, where wM, vM are positive constants.
Since ξck, ξak and φ(k+1) depend only on the weight w and the activation function ϑ (ξck and ξak are defined in the subsequent proof), Assumption 3 guarantees that they are bounded. For simplicity, we let ξcM, ξaM and φM denote the upper bounds of ξck, ξak and φ(k+1), respectively.
Define the weight estimation errors of the critic and action networks as $\tilde{w}_c=\hat{w}_c-w_c$ and $\tilde{w}_a=\hat{w}_a-w_a$, respectively, where $\hat{w}$ denotes the estimated weight and $w$ the optimal weight.
Theorem 4.1. Suppose that Assumptions 1–3 hold and the triggering condition is determined by (3.1). Let the weight-updating laws of the NNs be regulated by (4.7) and (4.10), respectively. Then the system state x(k) and the weight estimation errors ˜wc and ˜wa are uniformly ultimately bounded (UUB) under the following conditions:
$\eta_c < \frac{1}{2\vartheta_{cM}^2},\qquad \eta_a < \frac{1}{2\vartheta_{aM}^2},\qquad \|x(k)\| > \sqrt{\frac{D_M^2}{\lambda_{\min}(Q)(\alpha-2)C^2}},$ | (4.12) |
where $D_M^2 = F_M^2 + (1+2\eta_c\vartheta_{cM}^2)\zeta_{cM}^2$.
Proof. Only the case in which an event is triggered at time k needs to be considered: when no event is triggered, the control law u(k) is not updated and the associated weight vectors wc and wa remain unchanged, so the Lyapunov function depends only on the system state and, by Theorem 3.1, stability is guaranteed for all k.
Define the Lyapunov function in the following form:
$V(k) = x^T(k)x(k) + \frac{1}{\eta_c}\mathrm{tr}\{\tilde{w}_c^T\tilde{w}_c\} + \frac{1}{\eta_a}\mathrm{tr}\{\tilde{w}_a^T\tilde{w}_a\}.$ | (4.13) |
Let $L_1=x^T(k)x(k)$, $L_2=\frac{1}{\eta_c}\mathrm{tr}\{\tilde{w}_c^T\tilde{w}_c\}$ and $L_3=\frac{1}{\eta_a}\mathrm{tr}\{\tilde{w}_a^T\tilde{w}_a\}$. The first-order difference of $L_1$ has been discussed in Theorem 3.1.
Based on the weight updating law, the weight estimation error of the critic network can be deduced as
$\tilde{w}_c(k+1) = \hat{w}_c(k+1) - w_c = \hat{w}_c(k) - \eta_c\frac{\partial E_c}{\partial \hat{w}_{ck}} - w_c = \tilde{w}_c(k) - \eta_c\vartheta_c(z_c(k+1))e_c.$ | (4.14) |
Then the first-order difference of L2 can be denoted as
$\Delta L_2 = \frac{1}{\eta_c}\mathrm{tr}\{\tilde{w}_c^T(k+1)\tilde{w}_c(k+1) - \tilde{w}_c^T(k)\tilde{w}_c(k)\} = \frac{1}{\eta_c}\mathrm{tr}\{[\tilde{w}_c(k)-\eta_c\vartheta_c(z_c(k+1))e_c]^T[\tilde{w}_c(k)-\eta_c\vartheta_c(z_c(k+1))e_c] - \tilde{w}_c^T(k)\tilde{w}_c(k)\} = \mathrm{tr}\{-2\tilde{w}_c^T(k)\vartheta_c(z_c(k+1))e_c + \eta_c e_c^T\vartheta_c^T(z_c(k+1))\vartheta_c(z_c(k+1))e_c\}.$ | (4.15) |
The error function of the critic network is $e_c=\tilde{w}_c^T\vartheta_c(z_c(k+1))-\zeta_{ck}$; let $\xi_{ck}=\tilde{w}_c^T\vartheta_c(z_c(k+1))$. Substituting these into the above formula and using the Cauchy-Schwarz inequality, Eq (4.15) can be further bounded as
$\Delta L_2 \le 2\eta_c\|\xi_{ck}\vartheta_c(z_c(k+1))\|^2 + 2\eta_c\|\zeta_{ck}\vartheta_c(z_c(k+1))\|^2 - 2\|\xi_{ck}\|^2 + \mathrm{tr}\{2\xi_{ck}\zeta_{ck}\} \le -(1-2\eta_c\|\vartheta_c(z_c(k+1))\|^2)\|\xi_{ck}\|^2 + (1+2\eta_c\|\vartheta_c(z_c(k+1))\|^2)\|\zeta_{ck}\|^2 \le -(1-2\eta_c\vartheta_{cM}^2)\xi_{cM}^2 + (1+2\eta_c\vartheta_{cM}^2)\zeta_{cM}^2.$ | (4.16) |
The weight estimation error of the action network can be described as
$\tilde{w}_a(k+1) = \tilde{w}_a(k) - \eta_a\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi.$ | (4.17) |
The first-order difference of L3 can be denoted as
$\Delta L_3 = \frac{1}{\eta_a}\mathrm{tr}\{\tilde{w}_a^T(k+1)\tilde{w}_a(k+1) - \tilde{w}_a^T(k)\tilde{w}_a(k)\} = \frac{1}{\eta_a}\mathrm{tr}\{\|\tilde{w}_a(k)-\eta_a\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\|^2 - \|\tilde{w}_a(k)\|^2\} \le \mathrm{tr}\{-2\tilde{w}_a(k)\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\} + \eta_a\|\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\|^2.$ | (4.18) |
Define $\xi_{ak}=\tilde{w}_a^T\vartheta_a(\varsigma_{ak})$, $\Xi_1=\mathrm{tr}\{-2\tilde{w}_a(k)\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\}$ and $\Xi_2=\eta_a\|\vartheta_a(\varsigma_{ak})\hat{w}_c^T\vartheta_c(z_c(k+1))\hat{w}_c^T\varphi(k+1)\varpi\|^2$; then we can deduce that
$\Xi_1 = \|\hat{w}_c^T\vartheta_c(z_c(k+1)) - \hat{w}_c^T\varphi(k+1)\varpi\xi_{ak}\|^2 - \|\hat{w}_c^T\varphi(k+1)\varpi\xi_{ak}\|^2 - \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 \le \|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\xi_{ak}\|^2 + \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 \le \frac{1}{2}\|\xi_{ak}\|^4 + \frac{1}{2}\|\hat{w}_c^T\varphi(k+1)\varpi\|^4 + \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2,$ | (4.19) |
$\Xi_2 \le -\big[\|\hat{w}_c^T\varphi(k+1)\varpi\|^2 - \eta_a\|\vartheta_a(\varsigma_{ak})\hat{w}_c^T\varphi(k+1)\varpi\|^2\big]\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + \|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 \le -\big[1-\eta_a\|\vartheta_a(\varsigma_{ak})\|^2\big]\|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + \frac{1}{2}\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^4 + \frac{1}{2}\|\hat{w}_c^T\varphi(k+1)\varpi\|^4.$ | (4.20) |
Combining (4.19) and (4.20) with (4.18), ΔL3 satisfies
$\Delta L_3 \le -\big[1-\eta_a\|\vartheta_a(\varsigma_{ak})\|^2\big]\|\hat{w}_c^T\varphi(k+1)\varpi\|^2\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + F^2 \le -\big[1-\eta_a\vartheta_{aM}^2\big]w_{cM}^4\varphi_M^2\varpi_M^2\vartheta_{cM}^2 + F_M^2,$ | (4.21) |
where $F_M^2$ is defined by
$F^2 = \frac{1}{2}\|\xi_{ak}\|^4 + \|\hat{w}_c^T\varphi(k+1)\varpi\|^4 + \|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^2 + \frac{1}{2}\|\hat{w}_c^T\vartheta_c(z_c(k+1))\|^4 \le \frac{1}{2}\xi_{aM}^4 + w_{cM}^4\varphi_M^4\varpi_M^4 + w_{cM}^2\vartheta_{cM}^2 + \frac{1}{2}w_{cM}^4\vartheta_{cM}^4 = F_M^2.$ | (4.22) |
Based on (3.9), (4.16) and (4.21), we can conclude that
$\Delta L \le -\lambda_{\min}(Q)(\alpha-2)C^2\|x(k)\|^2 - (1-2\eta_c\vartheta_{cM}^2)\xi_{cM}^2 - \big[1-\eta_a\vartheta_{aM}^2\big]w_{cM}^4\varphi_M^2\varpi_M^2\vartheta_{cM}^2 + D_M^2.$ | (4.23) |
According to (4.12), the first-order difference of V is negative, which completes the proof.
Remark 4. In this section, it has been proved that the system states and the neural network weight estimation errors are uniformly ultimately bounded (UUB). This implies that the cost function and control law converge to neighborhoods of their optimal values. The convergence of the system is thus demonstrated theoretically, which supports the effectiveness of the proposed method.
Example 1. Consider the following mass-spring-damper system [23]:
$\begin{cases}\dot{x}_1 = x_2,\\ \dot{x}_2 = -\dfrac{b}{m}x_2 - \dfrac{k}{m}x_1 + \dfrac{F}{m},\end{cases}$ | (5.1) |
where m = 1 kg is the mass of the body, b = 3 N·s/m is the damping coefficient and k = 9 N/m is the linear spring constant. The control input u(k) of the system is the external force F. Choose the initial state vector x(0)=[−0.5,0.5]T and the constants α=2.5 and C=0.3. Based on the Euler method, the system can be discretized as
$\begin{cases}x_1(k+1) = 0.9996x_1(k) + 0.0099x_2(k),\\ x_2(k+1) = -0.0887x_1(k) + 0.97x_2(k) + 0.0099u(k).\end{cases}$ | (5.2) |
Set the control constraint as |u|<0.1 (u is one-dimensional, so the per-channel constraint |uj|<0.1 reduces to this). Let Q=I2 and R=I, where I2 and I denote identity matrices of appropriate dimensions. Three-layer neural networks are chosen to implement the algorithm. For the model network, 500 data samples are used for training and another 500 to test its performance. Then the critic and action networks are trained for 500 iterations to ensure that the given accuracy ε=10−5 is reached. In the training process, the learning rates are ηm=ηc=ηa=0.05.
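To illustrate the event-triggered loop on this example, the sketch below simulates the discretized plant (5.2) under condition (3.1), reusing `should_trigger` from the Section 3 sketch; the stabilizing feedback gain stands in for the trained action network and is purely hypothetical.

```python
import numpy as np

A = np.array([[0.9996, 0.0099],
              [-0.0887, 0.97]])
B = np.array([0.0, 0.0099])

def mu(x_ki):
    # placeholder for the trained action network; the gain is hypothetical,
    # and the output is clipped to the constraint |u| < 0.1
    return float(np.clip(-np.array([0.5, 1.0]) @ x_ki, -0.1, 0.1))

x = np.array([-0.5, 0.5])                     # x(0) as chosen above
x_ki, u, updates = x.copy(), mu(x), 1
for k in range(500):
    if should_trigger(x, x_ki, alpha=2.5, C=0.3):
        x_ki, u, updates = x.copy(), mu(x), updates + 1   # resample, refresh ZOH
    x = A @ x + B * u                         # plant step, Eq. (5.2)
print(f"{updates} controller updates over 500 steps, final ||x|| = {np.linalg.norm(x):.4f}")
```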
Moreover, in order to compare with the event-triggered HDP algorithm proposed in [23], we also present the controller designed by the event-based HDP algorithm. Then, we apply the optimal control laws designed by the event-based ADDHP and HDP techniques to the system for 500 time steps, respectively. The state curves obtained with these two methods are shown in Figures 2 and 3. It is evident that the proposed method converges faster and performs better than the event-triggered HDP algorithm. The corresponding control curves are shown in Figure 4. The control law is updated only when the triggering condition is violated. As displayed in Figure 4, the controller derived by the event-based ADDHP algorithm reduces the number of controller updates and converges faster while ensuring system performance.
Example 2. Consider the discrete-time nonlinear system:
$\begin{cases}x_1(k+1) = x_1(k) + 0.1x_2(k),\\ x_2(k+1) = -0.17\sin(x_1(k)) + 0.98x_2(k) + 0.1u_1(k),\\ x_3(k+1) = 0.1x_1(k) + 0.2x_2(k) + x_3(k)\cos(u_2(k)),\end{cases}$ | (5.3) |
where the state vector is x(k)=[x1(k),x2(k),x3(k)]T and the control input is u(k)=[u1(k),u2(k)]T. The weight matrices of the utility function are set as Q=I3 and R=0.01I2. The constraint boundary is set to 3.
The learning rates and other relevant parameters of the model network are chosen the same as in Example 1, but with the structure 5-8-3. We apply the developed algorithm to train the critic network (5-8-3) and the action network (3-8-2). The initial weights of these two networks are selected in the same way as in Example 1. Here, the initial state vector is chosen as x(0)=[0.5,0.5,0.5]T. For the event-based mechanism, we set the threshold parameters as α=3 and C=0.2. The state trajectories of the developed method and the event-triggered HDP algorithm are shown in Figures 5–7, and the corresponding control curves in Figures 8 and 9. Remarkably, an evident improvement in resource utilization is obtained under the event-driven formulation. From these results, we observe that the system performance is maintained while the control efficiency is markedly enhanced, which demonstrates the effectiveness of the event-driven ADDHP approach. Moreover, the convergence rate is faster than that of the event-driven HDP algorithm.
In this paper, a novel event-triggered control method has been studied for discrete-time nonlinear systems with constrained inputs. A novel triggering condition has been designed with a simpler form and fewer assumptions. Moreover, it has been proved that the system states and the neural network weight estimation errors are uniformly ultimately bounded. The simulation examples show that the proposed method can reduce the computational burden while ensuring system performance. However, due to the complexity of actual systems, full state feedback may be infeasible; therefore, other feedback control methods will be studied in the future.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors declare that they have no conflicts of interest.
[1] D. Liu, S. Xue, B. Zhao, B. Luo, Q. Wei, Adaptive dynamic programming for control: a survey and recent advances, IEEE T. Syst. Man Cy., 51 (2021), 142–160. https://doi.org/10.1109/TSMC.2020.3042876
[2] Y. Zhang, B. Zhao, D. Liu, Deterministic policy gradient adaptive dynamic programming for model-free optimal control, Neurocomputing, 387 (2020), 40–50. https://doi.org/10.1016/j.neucom.2019.11.032
[3] M. Ha, D. Wang, D. Liu, A novel value iteration scheme with adjustable convergence rate, IEEE T. Neur. Net. Lear., in press. https://doi.org/10.1109/TNNLS.2022.3143527
[4] C. Mu, D. Wang, H. He, Novel iterative neural dynamic programming for data-based approximate optimal control design, Automatica, 81 (2017), 240–252. https://doi.org/10.1016/j.automatica.2017.03.022
[5] L. Dong, X. Zhong, C. Sun, H. He, Adaptive event-triggered control based on heuristic dynamic programming for nonlinear discrete-time systems, IEEE T. Neur. Net. Lear., 28 (2017), 1594–1605. https://doi.org/10.1109/TNNLS.2016.2541020
[6] T. Li, D. Yang, X. Xie, H. Zhang, Event-triggered control of nonlinear discrete-time system with unknown dynamics based on HDP(λ), IEEE T. Cybernetics, 52 (2021), 6046–6058. https://doi.org/10.1109/TCYB.2020.3044595
[7] J. Lu, Q. Wei, T. Zhou, Z. Wang, F. Wang, Event-triggered near-optimal control for unknown discrete-time nonlinear systems using parallel control, IEEE T. Cybernetics, 53 (2023), 1890–1904. https://doi.org/10.1109/TCYB.2022.3164977
[8] J. Wang, Y. Wang, Z. Ji, Model-free event-triggered optimal control with performance guarantees via goal representation heuristic dynamic programming, Nonlinear Dyn., 108 (2022), 3711–3726. https://doi.org/10.1007/s11071-022-07438-y
[9] Z. Wang, J. Lee, X. Sun, Y. Chai, Y. Liu, Self-learning optimal control with performance analysis using event-triggered adaptive dynamic programming, Proceedings of the 5th International Conference on Crowd Science and Engineering, 2021, 29–34. https://doi.org/10.1145/3503181.3503187
[10] S. Xue, B. Luo, D. Liu, Y. Gao, Event-triggered ADP for tracking control of partially unknown constrained uncertain systems, IEEE T. Cybernetics, 52 (2022), 9001–9012. https://doi.org/10.1109/TCYB.2021.3054626
[11] D. Wang, M. Zhao, M. Ha, J. Ren, Neural optimal tracking control of constrained nonaffine systems with a wastewater treatment application, Neural Networks, 143 (2021), 121–132. https://doi.org/10.1016/j.neunet.2021.05.027
[12] J. Lu, Q. Wei, Y. Liu, T. Zhou, F. Wang, Event-triggered optimal parallel tracking control for discrete-time nonlinear systems, IEEE T. Syst. Man Cy., 52 (2022), 3772–3784. https://doi.org/10.1109/TSMC.2021.3073429
[13] K. Wang, Q. Gu, B. Huang, Q. Wei, T. Zhou, Adaptive event-triggered near-optimal tracking control for unknown continuous-time nonlinear systems, IEEE Access, 10 (2022), 9506–9518. https://doi.org/10.1109/ACCESS.2021.3140076
[14] Q. Wei, J. Lu, T. Zhou, X. Cheng, F. Wang, Event-triggered near-optimal control of discrete-time constrained nonlinear systems with application to a boiler-turbine system, IEEE T. Ind. Inform., 18 (2022), 3926–3935. https://doi.org/10.1109/TII.2021.3116084
[15] D. Wang, L. Hu, M. Zhao, J. Qiao, Adaptive critic for event-triggered unknown nonlinear optimal tracking design with wastewater treatment applications, IEEE T. Neur. Net. Lear., in press. https://doi.org/10.1109/TNNLS.2021.3135405
[16] B. Sun, E. van Kampen, Event-triggered constrained control using explainable global dual heuristic programming for nonlinear discrete-time systems, Neurocomputing, 468 (2022), 452–463. https://doi.org/10.1016/j.neucom.2021.10.046
[17] S. Xue, B. Luo, D. Liu, Y. Li, Adaptive dynamic programming based event-triggered control for unknown continuous-time nonlinear systems with input constraints, Neurocomputing, 396 (2020), 191–200. https://doi.org/10.1016/j.neucom.2018.09.097
[18] S. Zhang, B. Zhao, Y. Zhang, Event-triggered control for input constrained non-affine nonlinear systems based on neuro-dynamic programming, Neurocomputing, 440 (2021), 175–184. https://doi.org/10.1016/j.neucom.2021.01.116
[19] X. Yang, Q. Wei, Adaptive critic learning for constrained optimal event-triggered control with discounted cost, IEEE T. Neur. Net. Lear., 32 (2021), 91–104. https://doi.org/10.1109/TNNLS.2020.2976787
[20] M. Ha, D. Wang, D. Liu, Event-triggered adaptive critic control design for discrete-time constrained nonlinear systems, IEEE T. Syst. Man Cy., 50 (2020), 3158–3168. https://doi.org/10.1109/TSMC.2018.2868510
[21] M. Ha, D. Wang, D. Liu, B. Zhao, Adaptive event-based control for discrete-time nonaffine systems with constrained inputs, Proceedings of the Eighth International Conference on Information Science and Technology (ICIST), 2018, 104–109. https://doi.org/10.1109/ICIST.2018.8426093
[22] B. Luo, Y. Yang, D. Liu, H. Wu, Event-triggered optimal control with performance guarantees using adaptive dynamic programming, IEEE T. Neur. Net. Lear., 31 (2020), 76–88. https://doi.org/10.1109/TNNLS.2019.2899594
[23] Z. Wang, Q. Wei, D. Liu, A novel triggering condition of event-triggered control based on heuristic dynamic programming for discrete-time systems, Optim. Contr. Appl. Meth., 39 (2018), 1467–1478. https://doi.org/10.1002/oca.2421
[24] C. Mu, K. Liao, K. Wang, Event-triggered design for discrete-time nonlinear systems with control constraints, Nonlinear Dyn., 103 (2021), 2645–2657. https://doi.org/10.1007/s11071-021-06218-4
[25] M. Ha, D. Wang, D. Liu, Event-triggered constrained control with DHP implementation for nonaffine discrete-time systems, Inform. Sciences, 519 (2020), 110–123. https://doi.org/10.1016/j.ins.2020.01.020