
We investigated the challenge of unconstrained distributed optimization with a time-varying objective function, employing a prediction-correction approach. Our method introduced a backward Euler prediction step that used the differential information from consecutive moments to forecast the trajectory's future direction. This predicted value was then refined through an iterative correction process. Our analysis and experimental results demonstrated that this approach effectively addresses the optimization problem without requiring the computation of the Hessian matrix's inverse.
Citation: Zhuo Sun, Huaiming Zhu, Haotian Xu. Distributed Newton method for time-varying convex optimization with backward Euler prediction[J]. AIMS Mathematics, 2024, 9(10): 27272-27292. doi: 10.3934/math.20241325
Time-varying distributed optimization (TVDO) has gained increasing attention due to its practical advantages over time-invariant distributed optimization (TIDO) in dynamic environments where the objective function or constraints may evolve over time. TVDO has been applied in various fields, including power systems [1], traffic systems [2], and robotics [3], with further applications in energy management [4]. In TVDO, the optimal solution changes over time, making traditional algorithms designed for TIDO, which aim to reach a static optimizer, unsuitable for direct application. To address TVDO, several discrete-time algorithms (DTAs) have been developed [5,6,7,8]. For instance, prediction-correction methods are employed in [6,7] to solve TVDO with a specific sampling period, where the tracking error is linked to the size of the sampling period. A comprehensive review of DTAs and related work can be found in the survey [9]. However, due to factors like the sampling period, the step size, or errors in local optimization, DTAs generally face challenges in asymptotically tracking the time-varying optimal trajectory.
Given the extensive use of digital computing and sensing units [10,11], we focus on a discrete-time framework. Specifically, we sample the objective function F(x;t) at discrete time instances tk, where k=0,1,2,…, and the sampling period is defined as h≜tk+1−tk. Instead of addressing the time-varying problem directly, we solve a series of time-invariant problems. If the sampling period h is made sufficiently small, the resulting solution trajectory x∗(tk) can be tracked with high accuracy. However, in most practical applications, solving these problems exactly for each time sample is impractical, as the computation time required to find each optimizer often exceeds the time scale on which the solution trajectory evolves, unless F(x;t) remains nearly stationary.
The batch method is one traditional approach for addressing the problem sequence, where the objective function is sampled at specified intervals, solving each resultant static problem within given periods. However, this method often fails to align with real-time processing demands, particularly when computational resources are limited and sampling intervals are short. This limitation becomes more pronounced with larger problem sizes, preventing convergence within the required timeframe [12]. Consequently, attention has shifted towards online optimization techniques. These methods update the optimization problem continuously throughout the algorithm's iterations, allowing for the extraction of suboptimal solutions at any stage, regardless of convergence status [1,13]. Over time, these solutions increasingly approximate the optimal solution. In pursuit of improving this approach, various strategies have been developed. For example, a method based solely on correction operations was introduced, achieving an asymptotic error bound on the order of O(h) [14]. Another strategy involves a prediction-correction algorithm tailored for time-varying parameter optimization, though it requires an initial optimal solution, limiting its practical application [15]. Nonetheless, these methods have facilitated some theoretical advances in reducing the computational load, especially in convex optimization scenarios. Further, the interior point method has been applied to solve constrained convex optimization problems characterized by time-varying elements, utilizing a log-barrier penalty function and a dynamic system comprising predictive and corrective components based on Newton's method [16]. For unconstrained time-varying optimization, a suite of algorithms deploying prediction-correction strategies has been proposed. These include Gradient Trajectory Tracking (GTT) and Approximate Gradient Tracking (AGT), which achieve an O(h2) asymptotic error range. Moreover, Newton Trajectory Tracking (NTT) and Approximate Newton Tracking (ANT) are shown to offer superior error bounds [10]. Building on this framework, a decentralized prediction-correction algorithm for time-varying network optimization challenges has been developed, employing novel matrix splitting techniques and approximating matrix inverses using the Taylor series [6,17].
Prediction-correction algorithms [18], utilizing nonstationary optimization techniques [15,19,20], have been developed to iteratively solve convex programs that change over time. These algorithms function by predicting, at time tk, the optimal solution for the next time step tk+1, using an approximation of how the objective function F varies during this interval. The initial prediction is then refined through gradient or Newton descent methods. However, these algorithms are tailored for centralized systems. In our work, we address time-varying convex programs in decentralized environments, where nodes can communicate only with their immediate neighbors. Therefore, the prediction-correction methods proposed in [18] are not directly applicable in this decentralized context.
The algorithms discussed above that incorporate a prediction step achieve tighter error bounds, but they necessitate the computation or approximation of the inverse of the Hessian matrix. In certain scenarios, this process also involves calculating mixed partial derivatives. It is well established that computing the inverse of a matrix is computationally intensive, with complexity typically represented as O(p3), where p is the matrix's dimension. This cost escalates further for more intricate objective functions due to the required mixed partial derivative calculations. In this paper, we outline our major contributions as follows:
1) We introduce a backward Euler prediction step tailored for the unconstrained time-varying optimization problem. This approach eliminates the need for calculating matrix inverses and computing mixed partial derivatives, resulting in reduced computational complexity. The corrective aspect of the algorithm is derived from a Newton step.
2) We establish the convergence of our proposed algorithm and detail its convergence rate. The asymptotic error associated with our algorithm depends on the sampling period h, ranging from a worst-case scenario of O(h2) to an optimal scenario of O(h4).
The structure of this paper is as follows: In Section Ⅱ, we provide the mathematical foundations essential for the development of the major results. In Section Ⅲ, we detail the algorithms incorporating the backward Euler prediction step. The performance of these algorithms is examined in Section Ⅳ. A numerical example demonstrating the practical application of our theoretical findings is presented in Section Ⅴ. We conclude in Section Ⅵ.
We focus on a connected, undirected graph G=(V,E), where the vertex set V consists of n nodes, and the edge set E consists of m edges. Distributed optimization algorithms are employed to address the problem of minimizing a global smooth, strongly convex cost function across a set of nodes, where the objective function is expressed as a sum of local functions. We intend to solve the following problem raised in [6]
\[ \mathbf{x}^*(t) := \arg\min_{\mathbf{x}\in\mathbb{R}^{np}} F(\mathbf{x};t) := f(\mathbf{x};t) + g(\mathbf{x};t), \quad t \ge 0, \tag{2.1} \]
where \(\mathbf{x}\in\mathbb{R}^{np}\) represents the stacked vector of decision variables for all nodes, \(\mathbf{x}^i\in\mathbb{R}^p\) denotes the decision variable for node \(i\), and \(\mathbf{x}=((\mathbf{x}^1)^T,\dots,(\mathbf{x}^n)^T)^T\). The function \(f(\mathbf{x};t) := \sum\limits_{i \in V} f^i(\mathbf{x}^i; t)\) is the sum of the locally available functions at each node, while \(g(\mathbf{x};t) := \sum\limits_{i \in V} g^{i,i}(\mathbf{x}^i; t) + \sum\limits_{(i,j) \in E} g^{i,j}(\mathbf{x}^i, \mathbf{x}^j; t)\) incorporates terms induced by the network structure G, necessitating coordination and information exchange across the network. Before introducing distributed protocols to solve the problem in (2.1), we present an example to clarify the problem setting.
For example, in a wireless sensor network, each sensor node must adhere to channel capacity and interference constraints, which can be formulated as a resource allocation problem. The time-varying utility function for sensor i is defined as fi:Rp×R+→R, with the decision variable xi∈Rp representing the resources allocated to node i in a network G consisting of n sensors. This leads to the formulation of the time-varying resource allocation problem.
\[ \min_{\mathbf{x}^1\in\mathbb{R}^p,\dots,\mathbf{x}^n\in\mathbb{R}^p}\; \sum_{i\in V} f^i(\mathbf{x}^i;t) \quad \text{subject to} \quad A\mathbf{x} = \mathbf{b}(t), \tag{2.2} \]
where the matrix A∈Rlp×np is the augmented graph edge incidence matrix. This matrix is composed of l×n square blocks, each of dimension p. For an edge e=(j,k), where j<k, that connects node j to node k, the block (e,j) in A is given by [A]ej=Ip, and the block [A]ek=−Ip, where Ip denotes the identity matrix of dimension p. All other blocks in A are zeros. The time-varying vectors b(t)∈Rlp are determined by channel capacity and rate transmission constraints.
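To make the block structure of \(A\) concrete, here is a minimal Python sketch (an illustration we add for clarity; the function name, the 0-based node indexing, and the edge ordering are our own assumptions, not part of the paper):

```python
import numpy as np

def incidence_matrix(edges, n, p):
    """Augmented graph edge incidence matrix A in R^{lp x np}.

    For edge e = (j, k) with j < k, block (e, j) is +I_p and block (e, k) is -I_p;
    all other blocks are zero (cf. the description of A above).
    """
    l = len(edges)
    A = np.zeros((l * p, n * p))
    I = np.eye(p)
    for e, (j, k) in enumerate(edges):
        A[e * p:(e + 1) * p, j * p:(j + 1) * p] = I
        A[e * p:(e + 1) * p, k * p:(k + 1) * p] = -I
    return A

# Example: a path graph on 3 nodes with p = 2 (0-based node labels).
edges = [(0, 1), (1, 2)]
A = incidence_matrix(edges, n=3, p=2)
print(A.shape)   # (4, 6), i.e., (lp, np) with l = 2, n = 3
```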
In our work, we consider the nodes to be consumer devices aiming to solve decentralized approximations of the problem defined in (2.2). Specifically, we explore the approximate augmented Lagrangian relaxation of (2.2), and solve
\[ \min_{\mathbf{x}^1\in\mathbb{R}^p,\dots,\mathbf{x}^n\in\mathbb{R}^p}\; \sum_{i\in V} f^i(\mathbf{x}^i;t) + \frac{1}{\beta^2}\left\|A\mathbf{x}-\mathbf{b}(t)\right\|^2, \tag{2.3} \]
where the parameter \(\beta>0\) acts as a regularization term, encouraging consistency among all nodes. In (2.3), the matrix \(A\in\mathbb{R}^{lp\times np}\) is the augmented graph edge incidence matrix introduced above. It is straightforward to observe that the first term in (2.3) is identical to the first term in (2.1). Moreover, given the definition of \(A\), the second term in (2.3) decomposes into edge-wise terms of the form \(\|\mathbf{x}^i-\mathbf{x}^j-\mathbf{b}^{ij}(t)\|^2\), which correspond to the functions \(g^{i,j}(\mathbf{x}^i,\mathbf{x}^j;t)\) in (2.1), as made explicit below.
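Concretely, writing \(\mathbf{b}^{ij}(t)\) for the block of \(\mathbf{b}(t)\) associated with edge \((i,j)\) (a labeling we introduce here for readability), the block structure of \(A\) gives

\[ \|A\mathbf{x}-\mathbf{b}(t)\|^2 = \sum_{(i,j)\in E}\left\|\mathbf{x}^i-\mathbf{x}^j-\mathbf{b}^{ij}(t)\right\|^2, \]

so each summand couples only the neighboring pair \((\mathbf{x}^i,\mathbf{x}^j)\) and plays the role of a coupling term \(g^{i,j}(\mathbf{x}^i,\mathbf{x}^j;t)\) in (2.1).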
Notation: Let ‖⋅‖ represent the Euclidean norm. The gradient of the function F(x;t) with respect to x is denoted by ∇xF(x;t). The partial derivatives of ∇xF(x;t) with respect to x and t are indicated by ∇xxF(x;t) and ∇txF(x;t), respectively. The third-order derivative of F(x;t) with respect to x is denoted as ∇xxxF(x;t). The time derivative of the Hessian of F(x;t) with respect to t is expressed as ∇txxF(x;t)=∇xtxF(x;t). Finally, the second derivative of ∇xF(x;t) with respect to t is represented by ∇ttxF(x;t).
If x∗ represents the optimal solution for the objective function F(x;t), then the gradient of F(x;t) with respect to x must satisfy
\[ \nabla_{\mathbf{x}} F(\mathbf{x}^*;t) \equiv 0, \quad \forall t \ge 0. \tag{3.1} \]
As a consequence, the time derivative of this expression must also be zero, leading to
\[ \frac{d\,\nabla_{\mathbf{x}} F(\mathbf{x}^*;t)}{dt} = \nabla_{\mathbf{xx}} F(\mathbf{x}^*;t)\,\dot{\mathbf{x}}^*(t) + \nabla_{t\mathbf{x}} F(\mathbf{x}^*;t) = 0, \tag{3.2} \]
where \(\dot{\mathbf{x}}^*(t)\) represents the time derivative of \(\mathbf{x}^*(t)\). If the Hessian of F(x;t) is invertible, we can deduce from the above equation that
\[ \dot{\mathbf{x}}^*(t) = -\nabla_{\mathbf{xx}} F(\mathbf{x}^*;t)^{-1}\,\nabla_{t\mathbf{x}} F(\mathbf{x}^*;t). \tag{3.3} \]
We can apply either the gradient descent method or the Newton method to optimize F(x;t), leading to the formulation of a continuous-time dynamical system as follows:
\[ \dot{\mathbf{x}}(t) = -\gamma\,\nabla_{\mathbf{xx}} F(\mathbf{x};t)^{-1}\,\nabla_{\mathbf{x}} F(\mathbf{x};t), \tag{3.4} \]
where γ represents a control gain with the constraint 0<γ≤1. When γ is taken as α∇xxF(x;t), where α>0 is a constant, (3.4) reduces to the gradient descent dynamics; conversely, when γ=1, (3.4) is the Newton dynamics. The trajectory x(t) generated by (3.4) approaches a vicinity of x∗(t), but due to the time-varying nature of the solution, it does not converge precisely to x∗(t) [21].
If the optimal solution x∗(t0) were known for some t0≥0, the system (3.3) could be used to track the evolution of x∗(t), since (3.3) preserves the optimality condition ∇xF(x∗;t)=0 for all t≥t0. If x∗(t) is not accessible at any time, we can combine the dynamics (3.3) and (3.4) and devise the following dynamical system:
\[ \dot{\mathbf{x}}(t) = \mathcal{F}(\mathbf{x};t), \tag{3.5} \]
where \(\mathcal{F}(\mathbf{x};t)\) is defined as
\[ \mathcal{F}(\mathbf{x};t) = -\nabla_{\mathbf{xx}} F(\mathbf{x};t)^{-1}\,\nabla_{t\mathbf{x}} F(\mathbf{x};t) - \gamma\,\nabla_{\mathbf{xx}} F(\mathbf{x};t)^{-1}\,\nabla_{\mathbf{x}} F(\mathbf{x};t). \tag{3.6} \]
The right-hand side (3.6) consists of two components: a prediction component and a correction component. The prediction component, \(-\nabla_{\mathbf{xx}} F(\mathbf{x};t)^{-1}\nabla_{t\mathbf{x}} F(\mathbf{x};t)\), forecasts the change in the optimal solution (cf. (3.3)). The correction component, \(-\gamma\,\nabla_{\mathbf{xx}} F(\mathbf{x};t)^{-1}\nabla_{\mathbf{x}} F(\mathbf{x};t)\), steers \(\mathbf{x}(t)\) toward the optimum.
For any time interval [t0,t0+T], where 0<T<+∞, we can partition the interval into N sub-intervals such that tk+1=tk+h, for k=0,1,2,…,N−1, where h is the discretization step size. Let x(tk) represent the solution of (3.5) at time tk. For simplicity, we may denote x(tk) as xk throughout this paper.
A numerical method for approximating (3.5) is considered a one-step method if, for all k≥0, xk+1 depends solely on xk. Two examples of such one-step methods are presented in [22].
(1) The forward Euler method:
\[ \mathbf{x}_{k+1} \approx \mathbf{x}_k + h\,\mathcal{F}(\mathbf{x}_k, t_k). \tag{3.7} \]
(2) The backward Euler method:
\[ \mathbf{x}_k \approx \mathbf{x}_{k+1} - h\,\mathcal{F}(\mathbf{x}_{k+1}, t_{k+1}). \tag{3.8} \]
The primary distinction between the forward and backward methods lies in their treatment of the first-order approximation of \(\mathcal{F}(\mathbf{x};t)\). It follows from (3.8) that
\[ \mathbf{x}_{k-1} \approx \mathbf{x}_k - h\,\mathcal{F}(\mathbf{x}_k, t_k), \tag{3.9} \]
yielding
\[ \mathcal{F}(\mathbf{x}_k, t_k) \approx \frac{\mathbf{x}_k - \mathbf{x}_{k-1}}{h}. \tag{3.10} \]
Substituting (3.10) into (3.7) gives
\[ \mathbf{x}_{k+1} \approx \mathbf{x}_k + h\,\frac{\mathbf{x}_k - \mathbf{x}_{k-1}}{h} = 2\mathbf{x}_k - \mathbf{x}_{k-1}. \tag{3.11} \]
According to (3.11), we formulate the prediction step as follows:
\[ \mathbf{x}_{k+1|k} = 2\mathbf{x}_k - \mathbf{x}_{k-1}. \tag{3.12} \]
The prediction step thus depends only on the state information at times k and k−1. Compared with the prediction steps in [6,17], the prediction step in our algorithm reduces the computational complexity significantly, as the numerical illustration below suggests.
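As a quick numerical check (a toy example we add for illustration; the trajectory \(\cos(t)\) is not from the paper), the extrapolation \(2\mathbf{x}_k-\mathbf{x}_{k-1}\) tracks a smooth trajectory with an error that scales like \(h^2\), consistent with Proposition 1 below:

```python
import numpy as np

x_star = np.cos  # a smooth stand-in for the optimal trajectory, used only for this illustration

for h in (0.1, 0.05, 0.025):
    t = np.arange(0.0, 10.0, h)
    x = x_star(t)                          # samples of x*(t_k)
    pred = 2 * x[1:-1] - x[:-2]            # backward Euler prediction x_{k+1|k} = 2 x_k - x_{k-1}
    err = np.max(np.abs(pred - x[2:]))     # compare with the true x*(t_{k+1})
    print(f"h = {h:5.3f}   max prediction error = {err:.2e}")
# Halving h cuts the error by roughly a factor of 4, i.e., O(h^2) behavior.
```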
To address the time-varying optimization problem in (2.1), we sample the continuous time-varying objective function F(x;t) at discrete times tk, where k=0,1,2,…. This transformation turns the continuous time-varying optimization problem into a sequence of time-invariant optimization problems.
\[ \mathbf{x}^*(t_k) := \arg\min_{\mathbf{x}\in\mathbb{R}^{np}} F(\mathbf{x};t_k), \quad k \ge 0. \tag{3.13} \]
In the networked setting, the prediction step is given by (3.12). For node i, the predicted variable \(\mathbf{x}^i_{k+1|k}\) depends only on the node's own information and is given by
\[ \mathbf{x}^i_{k+1|k} = 2\mathbf{x}^i_k - \mathbf{x}^i_{k-1}, \tag{3.14} \]
so \(\mathbf{x}^i_{k+1|k}\) is computed through purely local operations. For the correction step of our algorithm, however, the Newton term involves the inverse of a matrix, which would require global communication. To obtain a decentralized protocol for correcting the predicted variable \(\mathbf{x}_{k+1|k}\) [18], we apply the decentralized Newton method proposed in [6], which is formulated as follows:
\[ \mathbf{x}_{k+1} = \mathbf{x}_{k+1|k} - \gamma\,\mathbf{c}_{k+1,(K)}, \tag{3.15} \]
where \(\mathbf{c}_{k+1,(K)}\in\mathbb{R}^{np}\) is the correction direction, defined as
\[ \mathbf{c}_{k+1,(K)} = D_{k+1|k}^{-1/2}\sum_{\tau=0}^{K}\left(D_{k+1|k}^{-1/2}B_{k+1|k}D_{k+1|k}^{-1/2}\right)^{\tau}D_{k+1|k}^{-1/2}\,\nabla_{\mathbf{x}}F(\mathbf{x}_{k+1|k};t_{k+1}). \tag{3.16} \]
The operator \(D_{k+1|k}^{-1/2}\sum_{\tau=0}^{K}\big(D_{k+1|k}^{-1/2}B_{k+1|k}D_{k+1|k}^{-1/2}\big)^{\tau}D_{k+1|k}^{-1/2}\) is the K-th order approximation of \(\nabla_{\mathbf{xx}}F(\mathbf{x}_{k+1|k};t_{k+1})^{-1}\), obtained by truncating the series [6,21]
\[ \nabla_{\mathbf{xx}}F(\mathbf{x}_{k+1|k};t_{k+1})^{-1} = D_{k+1|k}^{-1/2}\sum_{\tau=0}^{\infty}\left(D_{k+1|k}^{-1/2}B_{k+1|k}D_{k+1|k}^{-1/2}\right)^{\tau}D_{k+1|k}^{-1/2}, \tag{3.17} \]
where the matrices \(D_{k+1|k}\) and \(B_{k+1|k}\) are defined as
\[ \begin{aligned} D_{k+1|k} &:= \nabla_{\mathbf{xx}}f(\mathbf{x}_{k+1|k};t_{k+1}) + \mathrm{diag}\!\left[\nabla_{\mathbf{xx}}g(\mathbf{x}_{k+1|k};t_{k+1})\right], \\ B_{k+1|k} &:= \mathrm{diag}\!\left[\nabla_{\mathbf{xx}}g(\mathbf{x}_{k+1|k};t_{k+1})\right] - \nabla_{\mathbf{xx}}g(\mathbf{x}_{k+1|k};t_{k+1}), \end{aligned} \tag{3.18} \]
where \(\mathrm{diag}[\nabla_{\mathbf{xx}}g(\mathbf{x}_{k+1|k};t_{k+1})]\) denotes the block diagonal matrix containing the diagonal blocks of \(\nabla_{\mathbf{xx}}g(\mathbf{x}_{k+1|k};t_{k+1})\). The matrix
\[ D^{ii}_{k+1|k} = \nabla_{\mathbf{x}^i\mathbf{x}^i}f^i(\mathbf{x}^i_{k+1|k};t_{k+1}) + \nabla_{\mathbf{x}^i\mathbf{x}^i}g^{i,i}(\mathbf{x}^i_{k+1|k};t_{k+1}) + \sum_{j\in N_i}\nabla_{\mathbf{x}^i\mathbf{x}^i}g^{i,j}(\mathbf{x}^i_{k+1|k},\mathbf{x}^j_{k+1|k};t_{k+1}) \tag{3.19} \]
is computed at node i. The last term in (3.19) links the decisions of node i with those of its neighboring nodes \(j\in N_i\). The structure of the matrix \(B_{k+1|k}\) is determined by the graph topology, with null diagonal blocks \(B^{i,i}_{k+1|k}\) and non-zero off-diagonal blocks \(B^{i,j}_{k+1|k}\) whenever nodes i and j are connected. The off-diagonal blocks are computed as
\[ B^{ij}_{k+1|k} = -\nabla_{\mathbf{x}^i\mathbf{x}^j}g^{i,j}(\mathbf{x}^i_{k+1|k},\mathbf{x}^j_{k+1|k};t_{k+1}). \tag{3.20} \]
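To illustrate the splitting behind (3.17) and (3.18), the following Python sketch (a toy example with synthetic matrices, not the paper's implementation) checks numerically that the truncated series approaches \((D-B)^{-1}\) as \(K\) grows, provided the normalized matrix \(D^{-1/2}BD^{-1/2}\) has spectral radius below one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic splitting H = D - B with D diagonal positive definite and B symmetric,
# chosen so that D^{-1/2} B D^{-1/2} has spectral radius below 1 (cf. (4.5) below).
n = 6
D = np.diag(rng.uniform(2.0, 4.0, n))
B = rng.uniform(-0.2, 0.2, (n, n))
B = 0.5 * (B + B.T)
H = D - B

D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D)))
M = D_inv_sqrt @ B @ D_inv_sqrt            # the matrix whose powers appear in (3.17)
assert np.max(np.abs(np.linalg.eigvalsh(M))) < 1.0

H_inv = np.linalg.inv(H)
for K in (0, 2, 5, 10):
    # K-th order truncation of the series in (3.17)
    series = sum(np.linalg.matrix_power(M, tau) for tau in range(K + 1))
    approx = D_inv_sqrt @ series @ D_inv_sqrt
    print(f"K = {K:2d}   ||approx - inv(H)|| = {np.linalg.norm(approx - H_inv):.2e}")
```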
Lemma 1. ([21]) By the definition of the correction direction in (3.16), the sequence of correction directions satisfies
\[ \mathbf{c}_{k+1,(\tau+1)} = D_{k+1|k}^{-1}\left(B_{k+1|k}\,\mathbf{c}_{k+1,(\tau)} + \nabla_{\mathbf{x}}F(\mathbf{x}_{k+1|k};t_{k+1})\right), \tag{3.21} \]
Defining \(\mathbf{c}^i_{k+1,(K)}\) and \(\nabla_{\mathbf{x}}F^i(\mathbf{x}_{k+1|k};t_{k+1})\) as the i-th components of \(\mathbf{c}_{k+1,(K)}\) and \(\nabla_{\mathbf{x}}F(\mathbf{x}_{k+1|k};t_{k+1})\), respectively, the recursion (3.21) can be decomposed into local components:
\[ \mathbf{c}^i_{k+1,(\tau+1)} = -\left(D^{ii}_{k+1|k}\right)^{-1}\left(\sum_{j\in N_i}B^{ij}_{k+1|k}\,\mathbf{c}^j_{k+1,(\tau)} + \nabla_{\mathbf{x}}F^i(\mathbf{x}_{k+1|k};t_{k+1})\right). \tag{3.22} \]
The gradient component in (3.22)
\[ \nabla_{\mathbf{x}}F^i(\mathbf{x}_{k+1|k};t_{k+1}) = \nabla_{\mathbf{x}^i}f^i(\mathbf{x}^i_{k+1|k};t_{k+1}) + \nabla_{\mathbf{x}^i}g^{i,i}(\mathbf{x}^i_{k+1|k};t_{k+1}) + \sum_{j\in N_i}\nabla_{\mathbf{x}^i}g^{i,j}(\mathbf{x}^i_{k+1|k},\mathbf{x}^j_{k+1|k};t_{k+1}) \tag{3.23} \]
is also computed at node i. Lemma 1 asserts that the component \(\mathbf{c}^i_{k+1,(K)}\) can indeed be calculated using local operations, which means that the iteration (3.15) can be executed in a distributed fashion. Consequently, node i computes \(\mathbf{x}^i_{k+1}\) by implementing the following local descent:
\[ \mathbf{x}^i_{k+1} = \mathbf{x}^i_{k+1|k} - \gamma\,\mathbf{c}^i_{k+1,(K)}. \tag{3.24} \]
We refer to the decentralized Newton method with a backward Euler prediction, which combines the descent step [cf. (3.24)] with the prediction step [cf. (3.14)], as DeNSP. The algorithm is summarized in Algorithm 1.
Algorithm 1: Decentralized Newton method with a simple prediction (DeNSP) at node \(i\).
Input: the local variable \(\mathbf{x}^i_k\), the approximation level \(K\), the step size \(\gamma\).
1: for \(t_k\) (\(k=0,1,2,\dots\)) do
2:  Predict the next trajectory using the prior information:
3:  if \(k>0\) then \(\mathbf{x}^i_{k+1|k}=2\mathbf{x}^i_k-\mathbf{x}^i_{k-1}\), else \(\mathbf{x}^i_{k+1|k}=\mathbf{x}^i_0\)
4:  end if
5:  Initialize the corrected variable \(\hat{\mathbf{x}}^{i,0}_{k+1}=\mathbf{x}^i_{k+1|k}\)
6:  Exchange the variable \(\hat{\mathbf{x}}^{i,0}_{k+1}\) with the neighbors \(j\in N_i\)
7:  Observe \(F^i(\cdot\,;t_{k+1})\) and compute \(\nabla_{\mathbf{x}}F^i(\mathbf{x}_{k+1|k};t_{k+1})\) [cf. (3.23)]
8:  Compute the matrices \(D^{ii}_{k+1|k}\) and \(B^{ij}_{k+1|k}\), \(j\in N_i\), as
   \(D^{ii}_{k+1|k}:=\nabla_{\mathbf{x}^i\mathbf{x}^i}f^i(\mathbf{x}^i_{k+1|k};t_{k+1})+\nabla_{\mathbf{x}^i\mathbf{x}^i}g^{i,i}(\mathbf{x}^i_{k+1|k};t_{k+1})+\sum_{j\in N_i}\nabla_{\mathbf{x}^i\mathbf{x}^i}g^{i,j}(\mathbf{x}^i_{k+1|k},\mathbf{x}^j_{k+1|k};t_{k+1})\)
   \(B^{ij}_{k+1|k}:=-\nabla_{\mathbf{x}^i\mathbf{x}^j}g^{i,j}(\mathbf{x}^i_{k+1|k},\mathbf{x}^j_{k+1|k};t_{k+1})\)
9:  Compute the initial correction direction \(\mathbf{c}^i_{k+1,(0)}=-(D^{ii}_{k+1|k})^{-1}\nabla_{\mathbf{x}}F^i(\mathbf{x}_{k+1|k};t_{k+1})\)
10: for \(\tau=0,1,2,\dots,K-1\) do
11:  Exchange the correction step \(\mathbf{c}^i_{k+1,(\tau)}\) with the neighboring nodes \(j\in N_i\) and compute
    \(\mathbf{c}^i_{k+1,(\tau+1)}=-(D^{ii}_{k+1|k})^{-1}\big(\sum_{j\in N_i}B^{ij}_{k+1|k}\mathbf{c}^j_{k+1,(\tau)}+\nabla_{\mathbf{x}}F^i(\mathbf{x}_{k+1|k};t_{k+1})\big)\)
12: end for
13: Correct the trajectory prediction: \(\mathbf{x}^i_{k+1}=\mathbf{x}^i_{k+1|k}+\gamma\,\mathbf{c}^i_{k+1,(K)}\)
14: end for
Output: the corrected variable \(\mathbf{x}^i_{k+1}\)
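For readers who prefer code, the per-period logic of Algorithm 1 can be rendered as the following Python sketch. It is a schematic illustration under our own simplifying assumptions: the local oracles grad_F_i, D_ii, and B_ij are supplied by the caller, and neighbor exchanges are emulated by indexing into shared lists rather than by actual communication.

```python
import numpy as np

def densp_update(x_prev, x_curr, t_next, neighbors, grad_F_i, D_ii, B_ij, K=3, gamma=1.0):
    """One sampling period of DeNSP for all n nodes (schematic sketch of Algorithm 1).

    x_prev, x_curr : lists of per-node variables at times t_{k-1} and t_k
    neighbors[i]   : list of the neighbors of node i
    grad_F_i(i, x_pred, t) -> local gradient block, cf. (3.23)
    D_ii(i, x_pred, t) and B_ij(i, j, x_pred, t) -> local matrices, cf. (3.19)-(3.20)
    """
    n = len(x_curr)

    # Prediction step (3.14): a purely local backward-Euler extrapolation.
    x_pred = [2.0 * x_curr[i] - x_prev[i] for i in range(n)]

    # Local matrices and gradient blocks evaluated at the predicted point.
    Dinv = [np.linalg.inv(D_ii(i, x_pred, t_next)) for i in range(n)]
    g = [grad_F_i(i, x_pred, t_next) for i in range(n)]

    # Initial correction direction (step 9 of Algorithm 1).
    c = [-Dinv[i] @ g[i] for i in range(n)]

    # K rounds of neighbor exchanges (steps 10-12 of Algorithm 1).
    for _ in range(K):
        c = [-Dinv[i] @ (sum(B_ij(i, j, x_pred, t_next) @ c[j] for j in neighbors[i]) + g[i])
             for i in range(n)]

    # Correction step (step 13 of Algorithm 1).
    return [x_pred[i] + gamma * c[i] for i in range(n)]
```

The signs follow the convention of Algorithm 1, where the correction direction already carries a negative sign and the update adds \(\gamma\,\mathbf{c}^i_{k+1,(K)}\) to the predicted variable.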
In this section, we examine the convergence of the algorithms introduced in Section Ⅲ. Our convergence analysis demonstrates that the discrepancy between the optimal solution x∗(tk) and the computed solution xk is ultimately upper bounded. Specifically, the error is predominantly influenced by the sampling time h. Throughout this paper, we adopt the following assumption regarding the objective function:
Assumption 1. Each node's local function \(f^i\) is twice differentiable, and the eigenvalues of the local Hessian matrix \(\nabla_{\mathbf{x}^i\mathbf{x}^i}f^i(\mathbf{x}^i;t)\) are bounded within \([m_1, M_1]\) for positive constants \(m_1\) and \(M_1\), i.e.,
\[ m_1 I \preceq \nabla_{\mathbf{x}^i\mathbf{x}^i}f^i(\mathbf{x}^i;t) \preceq M_1 I. \tag{4.1} \]
Assumption 2. The functions \(g^{i,i}(\mathbf{x}^i;t)\) and \(g^{i,j}(\mathbf{x}^i,\mathbf{x}^j;t)\) are twice differentiable, and the eigenvalues of the aggregate function's Hessian are bounded in \([0, L_1]\) with \(L_1<\infty\):
\[ 0 \preceq \nabla_{\mathbf{xx}}g(\mathbf{x};t) \preceq L_1 I. \tag{4.2} \]
Assumption 3. The following derivatives of \(F(\mathbf{x};t)\) are bounded for all \(\mathbf{x}\in\mathbb{R}^{np}\) and \(t\ge 0\):
\[ \|\nabla_{t\mathbf{x}}F(\mathbf{x};t)\| \le D_0,\quad \|\nabla_{\mathbf{xxx}}F(\mathbf{x};t)\| \le D_1,\quad \|\nabla_{\mathbf{x}t\mathbf{x}}F(\mathbf{x};t)\| \le D_2,\quad \|\nabla_{tt\mathbf{x}}F(\mathbf{x};t)\| \le D_3,\quad \|\nabla_{\mathbf{x}}F(\mathbf{x};t)\| \le D_4. \tag{4.3} \]
From the bounds on the Hessians ∇xxf(x;t) and ∇xxg(x;t) in Assumptions 1 and 2, respectively, the Hessian of the global cost ∇xxF(x;t) uniformly satisfies
\[ m_1 I \preceq \nabla_{\mathbf{xx}}F(\mathbf{x};t) \preceq (L_1+M_1) I. \tag{4.4} \]
These conditions ensure that the problem formulated in (2.1) is strongly convex, and they also ensure the invertibility of the Hessian matrix. To refine the contribution of our paper, we impose bounds on the higher-order derivatives of F, as stated in Assumption 3. A similar assumption has been utilized in previous works [6,10,18].
We turn to analyzing the DeNSP algorithm. As per the definition of \(D_{k+1|k}\) and \(B_{k+1|k}\) in (3.18), the matrix \(D_{k+1|k}^{-1/2}B_{k+1|k}D_{k+1|k}^{-1/2}\) is positive semidefinite, and its eigenvalues are upper bounded by a constant \(\rho<1\), as shown in Proposition 2 of [17]:
\[ 0 \preceq D_{k+1|k}^{-1/2}B_{k+1|k}D_{k+1|k}^{-1/2} \preceq \rho I, \tag{4.5} \]
where \(\rho=\left(1+2m_1/L_1\right)^{-1}<1\).
In the following theorem, we establish that the sequence generated by the DeNSP algorithm asymptotically converges to a neighborhood of the optimal trajectory whose radius depends on the sampling period h.
Proposition 1. The norm of the difference between prediction xk+1|k and the optimal solution x∗k+1 is upper bounded by
\[ \|\mathbf{x}_{k+1|k}-\mathbf{x}^*_{k+1}\| \le \delta\,\|\mathbf{x}_k-\mathbf{x}^*_k\| + h^2\Delta, \tag{4.6} \]
where \(\delta := 1 + h\left(\dfrac{D_1(D_0+\gamma D_4)}{m_1^2} + \dfrac{D_2+\gamma(L_1+M_1)}{m_1}\right)\) and \(\Delta = \dfrac{D_1(D_0+\gamma D_4)+(D_0+\gamma(L_1+M_1))}{m_1^3} + \dfrac{(D_0+\gamma D_4)(2D_2+\gamma m_1)}{m_1^2} + \dfrac{\gamma D_0+D_3}{m_1}\).
Proof. See Appendix A.
Theorem 1. Under Assumptions 1–3, fixing K as the level of Hessian inverse approximation for the correction step, there exist bounds \(\bar{K}\), \(\bar{h}\), and \(\bar{R}\) such that, if the sampling period satisfies \(h\le\bar{h}\), the approximation level satisfies \(K\ge\bar{K}\), and the initial optimality gap fulfills \(\|\mathbf{x}_0-\mathbf{x}^*(t_0)\|\le\bar{R}\), then \(\mathbf{x}_{k+1}\) converges to the optimal trajectory \(\mathbf{x}^*(t_{k+1})\) within a bounded error:
\[ \limsup_{k\to\infty}\|\mathbf{x}_{k+1}-\mathbf{x}^*(t_{k+1})\| = O\!\left(h^2\rho^{K+1}(1-\gamma)\right) + O(h^4). \tag{4.7} \]
In particular, when the approximation level K is chosen sufficiently large and γ=1,
\[ \limsup_{k\to\infty}\|\mathbf{x}_{k+1}-\mathbf{x}^*(t_{k+1})\| = O(h^4). \tag{4.8} \]
Proof. See Appendix B.
Theorem 1 indicates that the error bound (4.7) for the DeNSP algorithm is primarily governed by the sampling period h. In the worst case, the asymptotic error floor is on the order of \(O(h^2)\). However, according to (4.8), the bound can be tightened under certain conditions: if the approximation level K is chosen sufficiently large, DeNSP can achieve much tighter tracking performance, reducing the asymptotic error to the order of \(O(h^4)\). By carefully selecting the parameter K, the algorithm therefore provides significantly better accuracy in tracking the optimal solution over time, and reducing the sampling interval further improves the convergence accuracy. Furthermore, since the DeNSP algorithm omits the calculation of the Hessian matrix inverse in the prediction step, the computational complexity of its prediction step is \(O(p)\), whereas algorithms such as DPC-N and DAPC-N require the Hessian inverse in the prediction step and, ignoring the cost of gradient calculations, incur a computational complexity of \(O(p^3)\). In the correction step, the computational complexity of these algorithms is the same.
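A rough way to see the \(O(p)\) versus \(O(p^3)\) gap discussed above is the following micro-benchmark sketch (our own illustration; absolute timings depend on the machine and the linear-algebra backend):

```python
import time
import numpy as np

rng = np.random.default_rng(1)
for p in (100, 200, 400):
    xk, xkm1 = rng.standard_normal(p), rng.standard_normal(p)
    H = np.eye(p) + 0.01 * rng.standard_normal((p, p))
    H = H @ H.T                        # a well-conditioned SPD stand-in for a Hessian

    t0 = time.perf_counter()
    _ = 2 * xk - xkm1                  # backward Euler prediction, O(p)
    t1 = time.perf_counter()
    _ = np.linalg.inv(H)               # explicit Hessian inverse, O(p^3)
    t2 = time.perf_counter()
    print(f"p = {p:4d}   extrapolation: {t1 - t0:.1e} s   matrix inverse: {t2 - t1:.1e} s")
```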
In this section, we provide a numerical example to demonstrate the effectiveness of the DeNSP algorithm. We consider a resource allocation problem in a network of interconnected devices, as discussed in Section Ⅱ.
We solve the problem described by (2.2) using an approximate augmented Lagrangian method, as outlined in (2.3). The local objective functions associated with sensor i are defined as follows:
\[ f^i(\mathbf{x}^i;t) := \frac{1}{2}\left(\mathbf{x}^i-\cos(\omega t)\right)^2 + k\sum_{l=1}^{p}\log\!\left[1+\exp\!\left(b_{i,l}\,(x^{i,l}-d(t))\right)\right]. \tag{5.1} \]
In our simulation, the decision variables are \(\mathbf{x}^i\in\mathbb{R}^p\) with \(p=10\), and the scalar parameters are set to \(\omega=0.2\pi\), \(k=0.1\), \(b_{i,l}\sim\mathcal{U}[-2,2]\), and \(d(t)=\cos(\omega t)\). We consider \(n=50\) nodes in a wireless network, randomly distributed in the square \([-1,1]^2\); two nodes can communicate only if they are closer than the range \(r=2.5\sqrt{2}/\sqrt{n}\). The nodes thus generate a network with \(l\) links. We set \(\mathbf{b}=0\) in (2.3) and the tuning parameter \(\beta=\sqrt{20}\). We compare the DeNSP algorithm with the DPC-N and DAPC-N algorithms from [6]. The constant step size of DPC-N and DAPC-N is set to \(\gamma=1\). A minimal sketch of this setup is given below.
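The following Python sketch is our own reconstruction of the setup for illustration only; it builds the random geometric network and the local objective (5.1), not the full DeNSP/DPC-N/DAPC-N comparison (the elementwise reading of the squared term in (5.1) is an assumption):

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 50, 10
omega, k = 0.2 * np.pi, 0.1
b = rng.uniform(-2.0, 2.0, size=(n, p))      # b_{i,l} ~ U[-2, 2]
d = lambda t: np.cos(omega * t)

# Random geometric graph: nodes in [-1, 1]^2, connected if closer than r.
pos = rng.uniform(-1.0, 1.0, size=(n, 2))
r = 2.5 * np.sqrt(2.0) / np.sqrt(n)
dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
adjacency = (dist < r) & ~np.eye(n, dtype=bool)
edges = [(i, j) for i in range(n) for j in range(i + 1, n) if adjacency[i, j]]
print(f"generated a network with {len(edges)} links")

def f_i(i, xi, t):
    """Local objective (5.1) of sensor i at time t (squared term summed element-wise)."""
    return (0.5 * np.sum((xi - np.cos(omega * t)) ** 2)
            + k * np.sum(np.log1p(np.exp(b[i] * (xi - d(t))))))
```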
The convergence result (4.7) shows that the achievable accuracy depends on the number of correction iterations and on the sampling interval. To compare with the other algorithms, we fix a commonly used sampling interval h=0.1 and examine how the number of correction iterations affects the convergence accuracy. Figure 1 shows the convergence performance of the different algorithms. When the approximation level (i.e., the number of communication rounds) K is set to 3, the tracking performance of the algorithms is comparable, with little significant difference in effectiveness. However, when K is increased to 5, the DeNSP algorithm demonstrates a clear advantage, achieving superior tracking accuracy and responsiveness. This suggests that DeNSP benefits more from additional communication rounds, which enhance its ability to track the time-varying optimal solution precisely, making it particularly effective when higher precision is required. Furthermore, the DPC-N and DAPC-N algorithms must compute the inverse of the Hessian matrix, with complexity typically \(O(p^3)\), where p is the matrix dimension, and must also compute mixed partial derivatives in the prediction step by communicating with neighboring nodes. In contrast, our algorithm completes the prediction step using only previous iterates, which greatly reduces the computation time.
Although the prediction-correction algorithm proposed in this article simplifies the prediction step, the correction step still requires a substantial amount of computation involving the Hessian matrix. Furthermore, for optimization problems involving integer decision variables, the algorithm proposed in this paper cannot be applied directly, and such problems require further research.
In this paper, we propose a backward Euler prediction step for the problem of unconstrained distributed time-varying optimization. Through the theoretical analysis, the convergence accuracy of the proposed algorithm ranges from \(O(h^2)\) to \(O(h^4)\). Compared with the DPC-N and DAPC-N algorithms, our algorithm does not need to compute the inverse of the Hessian of the cost function, which saves computing time. Finally, we verify the theoretical results via a numerical example. In future research, we will further simplify the Hessian-related computation in the correction step.
Zhuo Sun: Conceptualization, Methodology, Writing-review & editing; Huaiming Zhu: Formal analysis, Investigation, Software, Writing-original draft; Haotian Xu: Investigation, Writing-review & editing. All authors have read and approved the final version of the manuscript for publication.
The authors declare no conflict of interest.
This work is supported in part by the National Natural Science Foundation of China (61304179, 71431001, 71831002, 72211540399); the Humanity and Social Science Youth Foundation of the Ministry of Education (19YJC630151); the International Association of Maritime Universities (20200205AMC); the Natural Science Foundation of Liaoning Province (2020-HYLH-32); and the Dalian Science and Technology Innovation Fund (2020JJ26GX023).
[1] J. S. Pan, A. Q. Tian, V. Snášel, L. Kong, S. C. Chu, Maximum power point tracking and parameter estimation for multiple-photovoltaic arrays based on enhanced pigeon-inspired optimization with Taguchi method, Energy, 251 (2022), 123863. https://doi.org/10.1016/j.energy.2022.123863
[2] A. Q. Tian, X. Y. Wang, H. Xu, J. S. Pan, V. Snášel, H. X. Lv, Multi-objective optimization model for railway heavy-haul traffic: Addressing carbon emissions reduction and transport efficiency improvement, Energy, 294 (2024), 130927. https://doi.org/10.1016/j.energy.2024.130927
[3] A. Simonetto, E. Dall'Anese, S. Paternain, G. Leus, G. B. Giannakis, Time-varying convex optimization: Time-structured algorithms and applications, Proc. IEEE, 108 (2020), 2032–2048. https://doi.org/10.1109/JPROC.2020.3003156
[4] Q. Li, Y. Liao, K. Wu, L. Zhang, J. Lin, M. Chen, J. M. Guerrero, et al., Parallel and distributed optimization method with constraint decomposition for energy management of microgrids, IEEE Trans. Smart Grid, 12 (2021), 4627–4640. https://doi.org/10.1109/TSG.2021.3097047
[5] S. Hosseini, A. Chapman, M. Mesbahi, Online distributed convex optimization on dynamic networks, IEEE Trans. Autom. Control, 61 (2016), 3545–3550. https://doi.org/10.1109/TAC.2016.2525928
[6] A. Simonetto, A. Koppel, A. Mokhtari, G. Leus, A. Ribeiro, Decentralized prediction-correction methods for networked time-varying convex optimization, IEEE Trans. Autom. Control, 62 (2017), 5724–5738. https://doi.org/10.1109/TAC.2017.2694611
[7] A. Simonetto, Dual prediction-correction methods for linearly constrained time-varying convex programs, IEEE Trans. Autom. Control, 64 (2018), 3355–3361. https://doi.org/10.1109/TAC.2018.2877682
[8] A. Q. Tian, F. F. Liu, H. X. Lv, Snow geese algorithm: A novel migration-inspired meta-heuristic algorithm for constrained engineering optimization problems, Appl. Math. Model., 126 (2024), 327–347. https://doi.org/10.1016/j.apm.2023.10.045
[9] X. Li, L. Xie, N. Li, A survey on distributed online optimization and game, arXiv preprint, 2022. https://doi.org/10.48550/arXiv.2205.00473
[10] A. Simonetto, E. Dall'Anese, Prediction-correction algorithms for time-varying constrained optimization, IEEE Trans. Signal Process., 65 (2017), 5481–5494. https://doi.org/10.1109/TSP.2017.2728498
[11] S. Qu, Y. Zhou, Y. Ji, Z. Dai, Z. Wang, Robust maximum expert consensus modeling with dynamic feedback mechanism under uncertain environments, J. Ind. Manag. Optim., 12 (2024), 4627–4640. https://doi.org/10.3934/jimo.2024093
[12] S. Bittanti, F. A. Cuzzola, A mixed GH2/H∞ approach for stabilization and accurate trajectory tracking of unicycle-like vehicles, Int. J. Control, 74 (2001), 880–888. https://doi.org/10.1080/00207170110037164
[13] Y. Tang, Time-varying optimization and its application to power system operation, Ph.D. thesis, California Institute of Technology, 2019. https://doi.org/10.7907/6N9W-3J20
[14] A. Y. Popkov, Gradient methods for nonstationary unconstrained optimization problems, Autom. Remote Control, 66 (2005), 883–891. https://doi.org/10.1007/s10513-005-0132-z
[15] A. L. Dontchev, M. I. Krastanov, R. T. Rockafellar, V. M. Veliov, An Euler-Newton continuation method for tracking solution trajectories of parametric variational inequalities, SIAM J. Control Optim., 51 (2013), 1823–1840. https://doi.org/10.1137/120876915
[16] M. Fazlyab, S. Paternain, V. M. Preciado, A. Ribeiro, Prediction-correction interior-point method for time-varying convex optimization, IEEE Trans. Autom. Control, 63 (2017), 1973–1986. https://doi.org/10.1109/TAC.2017.2760256
[17] A. Mokhtari, Q. Ling, A. Ribeiro, Network Newton–Part I: Algorithm and convergence, arXiv preprint, 2015. https://doi.org/10.48550/arXiv.1504.06017
[18] A. Simonetto, A. Mokhtari, A. Koppel, G. Leus, A. Ribeiro, A class of prediction-correction methods for time-varying convex optimization, IEEE Trans. Signal Process., 64 (2016), 4576–4591. https://doi.org/10.1109/TSP.2016.2568161
[19] P. Pedregal, Introduction to Optimization, Springer, 2004. https://doi.org/10.1007/b97412
[20] V. M. Zavala, M. Anitescu, Real-time nonlinear optimization as a generalized equation, SIAM J. Control Optim., 48 (2010), 5444–5467. https://doi.org/10.1137/090762634
[21] A. Mokhtari, Q. Ling, A. Ribeiro, Network Newton distributed optimization methods, IEEE Trans. Signal Process., 65 (2017), 146–161. https://doi.org/10.1109/TSP.2016.2617829
[22] A. Quarteroni, R. Sacco, F. Saleri, Numerical Mathematics, Texts in Applied Mathematics, vol. 37, Springer, 2010. https://doi.org/10.1007/978-1-4612-4442-4