1. Introduction
1.1. Background
This paper focuses on solving the problem of finding $x \in C$ such that:
$$F(x) = 0, \tag{1.1}$$
where $C \subseteq \mathbb{R}^n$ is a bounded, closed, and convex set, and the function $F : C \to \mathbb{R}^n$ is both continuous and monotone. Specifically, the monotonicity condition of the function is defined by the following inequality:
$$(F(x) - F(y))^T (x - y) \ge 0, \quad \forall\, x, y \in \mathbb{R}^n.$$
The study of such problems plays a crucial role in various scientific fields, such as economic equilibrium systems, chemical equilibrium, and signal recovery [1,2,3,4]. Extensive research has been devoted to developing numerical methods for solving this class of nonlinear systems of equations, leading to a variety of methods such as Newton's method, quasi-Newton methods, the Gauss-Newton method, Levenberg-Marquardt methods, and so on [5,6,7,8,9].
Newton's method and quasi-Newton methods have been extensively utilized to solve problem (1.1) due to their straightforward implementation and rapid convergence properties. However, these methods require computing the Jacobian matrix, or an approximation of it, at every iteration, which becomes inefficient for large-scale problems. As a result, conjugate gradient methods [10,11,12,13] have emerged as an effective numerical tool for large-scale problems, primarily due to their lower memory requirements. In these conjugate gradient methods, the iterative sequence of points is defined as follows:
$$x_{k+1} = x_k + t_k d_k, \tag{1.2}$$
where $x_{k+1}$ and $x_k$ are the next and current iteration points, respectively. The step size $t_k$ is determined through a line search technique, and $d_k$ denotes the search direction. Generally, the search direction $d_k$ is updated by the following formula:
$$d_k = \begin{cases} -F_k, & k = 0, \\ -F_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{1.3}$$
where $F(x_k)$ is abbreviated as $F_k$. Here, $\beta_k$ is a crucial parameter in the iterative process, reflecting the fundamental character of conjugate gradient methods, and is commonly known as the conjugate parameter.
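The generic update rules just described can be sketched in a few lines of Python (a minimal illustration only, not the full algorithm; the conjugate parameter $\beta_k$ is supplied as a precomputed value):

```python
import numpy as np

def update_direction(F_k, d_prev, beta_k):
    # Search direction: d_k = -F_k + beta_k * d_{k-1};
    # the first direction is simply d_0 = -F_0.
    return -F_k + beta_k * d_prev

def update_iterate(x_k, t_k, d_k):
    # Iterate update: x_{k+1} = x_k + t_k * d_k.
    return x_k + t_k * d_k
```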
A key aspect of conjugate gradient methods is ensuring that the search direction exhibits the sufficient descent and trust-region properties, which are critical for guaranteeing global convergence. For example, Narushima et al. [14] developed a method capable of generating a descent direction and rigorously proved its global convergence under certain conditions. Moreover, Narushima et al. [15] proposed a novel conjugate gradient method that combines smoothing techniques with the Polak-Ribière-Polyak (PRP) method to address unconstrained non-smooth equations. Huang et al. [16] proposed a biased stochastic conjugate gradient algorithm for nonconvex problems, which integrates the stochastic recursive gradient method and the modified Barzilai-Borwein technique into the typical stochastic gradient algorithm. Jiang et al. [17] proposed two restart conjugate gradient methods for unconstrained optimization, employing different restart procedures; their restart conditions are based on the Fletcher-Reeves and Dai-Yuan conjugate parameters, with the objective of ensuring that the search directions maintain sufficient descent. In recent years, conjugate gradient methods for unconstrained optimization have been widely applied to solving nonlinear systems of equations. For example, Cheng et al. [18] presented a PRP-type method designed for monotone equations, which effectively combines the PRP method from unconstrained optimization with hyperplane projection techniques. Subsequently, Yu et al. [19] introduced a spectral gradient method for large-scale nonlinear systems of equations, together with the classical PRP method used in unconstrained minimization problems. Waziri et al. [20] developed a Hager-Zhang-type conjugate gradient method aimed at solving monotone nonlinear systems of equations. Recently, Liu et al. [21] extended a classical three-term framework to a family of three-term conjugate gradient projection methods with a restart procedure, together with their inertial versions, for constrained nonlinear pseudo-monotone equations. Ibrahim et al. [22] extended a spectral three-term conjugate gradient framework for unconstrained optimization to two derivative-free methods.
1.2. Conjugate parameter design
We will discuss the design of the conjugate parameter in detail. For conjugate gradient methods in unconstrained optimization, Rivaie et al. [23] proposed a significant modification of the PRP-type conjugate parameter, with the modified expression as follows:
$$\beta_k^{RMIL} = \frac{F_k^T (F_k - F_{k-1})}{\|d_{k-1}\|^2}.$$
It is important to note that if the denominator $\|d_{k-1}\|^2$ in the above expression is replaced by $\|F_{k-1}\|^2$, then the expression reduces to the classical PRP-type conjugate parameter. To address the non-negativity issue of the conjugate parameter $\beta_k^{RMIL}$, Dai et al. [24] introduced a modified RMIL conjugate parameter, specifically formulated as follows:
$$\beta_k^{RMIL+} = \begin{cases} \dfrac{F_k^T (F_k - F_{k-1})}{\|d_{k-1}\|^2}, & \text{if } 0 \le F_k^T F_{k-1} < \|F_k\|^2, \\[1mm] 0, & \text{otherwise}. \end{cases} \tag{1.4}$$
It is particularly noteworthy that if $F_k^T F_{k-1}$ is negative or greater than or equal to $\|F_k\|^2$, then the conjugate parameter $\beta_k^{RMIL+}$ vanishes, causing the search direction to degenerate into the classical steepest descent direction. Therefore, to ensure that the search direction exhibits both the sufficient descent and trust-region properties, we further refine (1.4) and design a revised RMIL conjugate parameter, i.e.,
which implies that non-negativity holds, i.e.,
2. Search direction properties and algorithm framework
In this section, we examine the properties of the search direction, the line search technique, and the projection operator, all of which are critical components in the formulation and analysis of the proposed algorithm.
First, we discuss the sufficient descent and trust-region properties, which are essential for ensuring the convergence and stability of the proposed algorithm. The sufficient descent property guarantees that each search direction $d_k$ satisfies $F_k^T d_k \le -P_1 \|F_k\|^2$ for some constant $P_1 > 0$. To this end, we multiply both sides of (1.3) on the left by $F_k^T$, yielding:
Incorporating the non-negativity condition from (1.6) and the inequality $\mu(\|F_k\|^2 + \|d_{k-1}\|^2) + \|d_{k-1}\|^2 \ge 2\mu \|F_k\| \|d_{k-1}\|$, which follows from $(\|F_k\| - \|d_{k-1}\|)^2 \ge 0$, we can further derive:
which clearly demonstrates that the sufficient descent property is satisfied. The trust-region property is another critical aspect that ensures the stability of the algorithm by bounding the search direction. Applying the Cauchy-Schwarz inequality to (2.1), we derive the following inequality:
which implies that $P_1 \|F_k\| \le \|d_k\|$ holds. In addition, from (1.3) and (1.6), we obtain:
Hence, the trust-region property is satisfied, i.e.,
$$P_1 \|F_k\| \le \|d_k\| \le P_2 \|F_k\| \tag{2.2}$$
for some constant $P_2 > 0$.
Next, we turn our attention to the line search technique, which determines an appropriate step size $t_k$ along the search direction $d_k$. The step size is computed as $t_k = \xi \rho^{m_k}$, where $m_k$ is the smallest non-negative integer $m$ satisfying the following condition:
$$-F(x_k + \xi \rho^m d_k)^T d_k \ge \sigma \xi \rho^m \|F(x_k + \xi \rho^m d_k)\| \, \|d_k\|^2. \tag{2.3}$$
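A backtracking implementation of this step-size rule can be sketched as follows. The acceptance test $-F(x + t d)^T d \ge \sigma t \|F(x + t d)\| \|d\|^2$ used below is the standard derivative-free condition for this class of methods (an assumption about the exact condition, since the displayed equation did not survive extraction); default parameter values follow Section 4.1, and the `max_backtracks` cap is an added safeguard:

```python
import numpy as np

def line_search(F, x, d, xi=1.0, rho=0.74, sigma=1e-4, max_backtracks=100):
    """Return t = xi * rho**m for the smallest m >= 0 such that
    -F(x + t d)^T d >= sigma * t * ||F(x + t d)|| * ||d||^2."""
    t = xi
    for _ in range(max_backtracks):
        Fw = F(x + t * d)
        if -Fw @ d >= sigma * t * np.linalg.norm(Fw) * (d @ d):
            return t
        t *= rho
    return t  # fallback: return the smallest trial step
```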
Lastly, we define the projection operator, an essential tool in constrained optimization. The projection operator ensures that the iteration points of the algorithm remain within the feasible region defined by the problem's constraints. The projection operator $P_C$ onto a convex set $C$ is defined as follows:
$$P_C(x) = \arg\min_{y \in C} \|x - y\|,$$
where $\|x - y\|$ denotes the Euclidean distance between $x$ and $y$. In particular, the projection operator possesses the well-known non-expansiveness property:
$$\|P_C(x) - P_C(y)\| \le \|x - y\|, \quad \forall\, x, y \in \mathbb{R}^n.$$
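For concreteness, here is a minimal sketch of the projection operator in the special case of a box $C = [lo, hi]^n$ (the helper name is illustrative), where the projection reduces to componentwise clipping; the test checks the non-expansiveness property numerically:

```python
import numpy as np

def project_box(x, lo, hi):
    # For C = [lo, hi]^n, P_C(x) = argmin_{y in C} ||x - y||
    # is solved componentwise by clipping each coordinate.
    return np.clip(x, lo, hi)
```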
To sum up, we propose a modified RMIL conjugate gradient algorithm (abbreviated as the MRMIL algorithm). The procedure is outlined in Algorithm 1.
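Since Algorithm 1 itself is not reproduced in this excerpt, the following Python sketch shows only the typical structure of a projection-based conjugate gradient method for constrained monotone equations: a conjugate gradient direction, a derivative-free backtracking line search, and a Solodov-Svaiter-style hyperplane projection step. The clipped RMIL-type parameter used below is a stand-in assumption, not the paper's revised parameter:

```python
import numpy as np

def solve_monotone_eq(F, project, x0, sigma=1e-4, rho=0.74, xi=1.0,
                      eps=1e-5, max_iter=2000):
    # Sketch of a projection-based CG method for F(x) = 0 over a convex set.
    x = np.asarray(x0, dtype=float)
    F_prev = d_prev = None
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= eps:
            return x
        if k == 0:
            d = -Fx
        else:
            # RMIL-type conjugate parameter, clipped at zero (stand-in).
            beta = max(Fx @ (Fx - F_prev) / (d_prev @ d_prev), 0.0)
            d = -Fx + beta * d_prev
        # Backtracking line search for t_k = xi * rho**m.
        t = xi
        while True:
            w = x + t * d
            Fw = F(w)
            if -Fw @ d >= sigma * t * np.linalg.norm(Fw) * (d @ d) or t < 1e-12:
                break
            t *= rho
        if np.linalg.norm(Fw) <= eps:
            return w
        # Hyperplane projection step (Solodov-Svaiter type).
        lam = Fw @ (x - w) / (Fw @ Fw)
        x = project(x - lam * Fw)
        F_prev, d_prev = Fx, d
    return x
```

With `F(x) = x` and projection onto the non-negative orthant, the method returns the root at the origin in one step.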
3. Convergence analysis
In the following discussion, we consistently assume that the sequences $\{t_k\}$, $\{d_k\}$, $\{x_k\}$, $\{w_k\}$, and $\{F_k\}$ are generated by the MRMIL algorithm. Additionally, we introduce the following general assumptions:
Assumption S:
(S1) The solution set $C^*$ of the problem (1.1) is non-empty.
(S2) The function $F$ is monotone and continuous on $\mathbb{R}^n$.
It is important to note that the continuity condition in Assumption (S2) is less restrictive than Lipschitz continuity, which is the more traditional assumption in algorithms designed to solve nonlinear systems of equations.
Lemma 1. The line search technique is well-defined. That is, for all $k \ge 0$, there exists a non-negative integer $m_k$ such that (2.3) holds.
Proof. Similar to Lemma 3.2 in [4], the proof can be completed straightforwardly. □
Lemma 2. The following conclusions hold:
(1) The sequence $\{\|x_k - x^*\|\}$ converges for any $x^* \in C^*$.
(2) The sequence $\{x_k\}$ is bounded.
(3) $\lim_{k \to \infty} t_k \|d_k\| = 0$.
Proof. Similar to Lemma 3.3 in [4], the proof can be completed easily. □
Theorem 1. The sequence $\{x_k\}$ globally converges to a solution of the problem (1.1), i.e., $\liminf_{k \to \infty} \|F_k\| = 0$.
Proof. To establish the global convergence of the sequence $\{x_k\}$, we consider two possible cases based on the behavior of $\|F_k\|$ as $k \to \infty$.
Case 1 ($\liminf_{k \to \infty} \|F_k\| = 0$): In this case, the continuity of $F$ implies that the sequence $\{x_k\}$ has an accumulation point $x^*$ such that $F(x^*) = 0$. Additionally, the convergence of $\{\|x_k - x^*\|\}$ further indicates that the sequence $\{x_k\}$ converges to $x^*$.
Case 2 ($\liminf_{k \to \infty} \|F_k\| > 0$): Assume, for the sake of contradiction, that there exists a constant $r > 0$ such that $\|F_k\| \ge r$ for all $k \ge 0$. This assumption, together with (2.2), implies that $\|d_k\| \ge P_1 \|F_k\| \ge P_1 r$. Combining this with Lemma 2(3), it follows that $\lim_{k \to \infty} t_k = 0$. Due to the boundedness of the sequences $\{x_k\}$ and $\{d_k\}$, there exists an infinite index set $L$ such that
From the sufficient descent condition defined in (2.1), we have $-F(x_{k_l})^T d_{k_l} \ge P_1 \|F(x_{k_l})\|^2$ for any $l \in L$. Taking the limit as $l \to \infty$ in this inequality, we obtain
Additionally, applying the line search technique defined in (2.3), we obtain $-F(x_{k_l} + \rho^{-1} t_{k_l} d_{k_l})^T d_{k_l} < \sigma \rho^{-1} t_{k_l} \|F(x_{k_l} + \rho^{-1} t_{k_l} d_{k_l})\| \, \|d_{k_l}\|^2$ for any $l \in L$. Taking the limit as $l \to \infty$ in this inequality, we obtain
The two limiting inequalities are clearly contradictory. Therefore, the assumption that $\liminf_{k \to \infty} \|F_k\| > 0$ must be false, which implies that $\liminf_{k \to \infty} \|F_k\| = 0$. □
4. Numerical experiments
In this section, we present the general experimental setups and conduct two distinct experiments: one focused on solving nonlinear systems of equations and the other on sparse signal restoration.
4.1. General setups
To validate the effectiveness of the MRMIL algorithm, we conduct a comparative analysis against similar algorithms, including the modified Fletcher-Reeves method (MFRM) algorithm [25] and the hybrid three-term conjugate gradient projection (HTTCGP) algorithm [26]. This comparative study provides a robust framework for evaluating the performance of the MRMIL algorithm, particularly on large-scale nonlinear systems of equations and sparse signal restoration problems. All experimental codes are executed on a system running the Ubuntu 20.04.2 LTS 64-bit operating system, equipped with an Intel(R) Xeon(R) Gold 5115 processor with a clock speed of 2.40 GHz. A key aspect of the analysis is the modification of step 10 of the MRMIL algorithm, where the search direction is instead calculated according to the MFRM and HTTCGP algorithms. This modification allows us to isolate and assess the impact of different search direction strategies on overall performance. Importantly, all other steps of the MRMIL algorithm remain unchanged during this comparison, ensuring that any differences in performance can be attributed to the search direction computation.
The parameters of the MRMIL algorithm are carefully selected to optimize its performance; specifically, $\mu = 2$, $\sigma = 10^{-4}$, $\rho = 0.74$, $\xi = 1$, and $\epsilon = 10^{-5}$. For the MFRM and HTTCGP algorithms, the parameters are kept at their default settings as specified in the original references.
4.2. Experiments on nonlinear systems of equations
To thoroughly evaluate the performance of the proposed algorithm, we select a set of benchmark problems with varying dimensions, where the number of variables $n$ ranges over $\{1000, 5000, 10{,}000, 50{,}000, 100{,}000\}$. The initial points for these benchmark problems are set as follows to provide diverse starting conditions, ensuring a comprehensive assessment of the algorithm: $x_1 = [0,1]^n$, $x_2 = \left(1-\frac{1}{n}, 1-\frac{2}{n}, \ldots, 1-\frac{n}{n}\right)^T$, $x_3 = \left(\frac{1}{3}, \frac{1}{3^2}, \ldots, \frac{1}{3^n}\right)^T$, $x_4 = \left(\frac{1}{n}, \frac{2}{n}, \ldots, \frac{n}{n}\right)^T$, $x_5 = \left(1, \frac{1}{2}, \ldots, \frac{1}{n}\right)^T$, $x_6 = (1, 1, \ldots, 1)^T$, $x_7 = \left(\frac{1}{2}, \frac{1}{2^2}, \ldots, \frac{1}{2^n}\right)^T$, $x_8 = \left(0, \frac{1}{n}, \ldots, \frac{n-1}{n}\right)^T$. The benchmark problems are formulated as $F(x) = (F_1(x), F_2(x), \ldots, F_n(x))^T$ with $x = (x_1, x_2, \ldots, x_n)^T$, as defined in Problems 1–8. Each algorithm is run on each benchmark problem with the termination criterion $\|F_k\| \le \epsilon$ or the number of iterations exceeding 2000.
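The deterministic initial points $x_2$–$x_8$ above can be generated as follows ($x_1$ is omitted because its definition is not fully legible in the text; the function name is illustrative):

```python
import numpy as np

def initial_points(n):
    # Initial points x2--x8 as listed above. x3 and x7 underflow to 0.0 for
    # large indices in floating point, matching their limiting behavior.
    i = np.arange(1, n + 1, dtype=float)
    return {
        "x2": 1 - i / n,
        "x3": 3.0 ** (-i),
        "x4": i / n,
        "x5": 1.0 / i,
        "x6": np.ones(n),
        "x7": 2.0 ** (-i),
        "x8": (i - 1) / n,
    }
```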
Problem 1. Let the function $F(x)$ be defined as:
where the feasible region is $C = \mathbb{R}^n_+$.
Problem 2. Let the function $F(x)$ be defined as:
where the feasible region is $C = \mathbb{R}^n_+$.
Problem 3. Let the function $F(x)$ be defined as:
where the feasible region is $C = [-2, +\infty)^n$.
Problem 4. Let the function $F(x)$ be defined as:
where the feasible region is $C = [-1, +\infty)^n$.
Problem 5. Let the function $F(x)$ be defined as:
where the feasible region is $C = \mathbb{R}^n_+$.
Problem 6. Let the function $F(x)$ be defined as:
where the feasible region is $C = \mathbb{R}^n_+$.
Problem 7. Let the function $F(x)$ be defined as:
where the feasible region is $C = \mathbb{R}^n_+$.
Problem 8. Let the function $F(x)$ be defined as:
where the feasible region is $C = \mathbb{R}^n_+$.
The performance of the MRMIL, MFRM, and HTTCGP algorithms is evaluated on the benchmark problems, with the results shown in Tables 1–8. In these tables, "Init(n)" denotes the initial point together with the corresponding dimension, where the dimension equals the listed value multiplied by 1000. The results are presented in the format CPUT/NF/NIter/Norm, where "CPUT" denotes the CPU time in seconds, "NF" the number of function evaluations, "NIter" the number of iterations, and "Norm" the norm of the function at the approximate optimal point. The tables demonstrate that all three algorithms effectively solve the benchmark problems across a range of initial points and dimensions. It is particularly noteworthy that the MRMIL algorithm consistently outperforms the others in most cases. To analyze the performance of the three algorithms thoroughly, we employ the performance profiles methodology developed by Dolan and Moré [27], which is defined by the following fraction:
$$\rho_q(\tau) = \frac{1}{|T_p|} \left| \left\{ t_p \in T_p : \frac{t_{p,q}}{\min\{t_{p,q'} : q' \in Q\}} \le \tau \right\} \right|,$$
where $T_p$ is the test set, $|T_p|$ is the number of problems in the test set, $Q$ is the set of solvers, and $t_{p,q}$ is the performance indicator (e.g., NIter, NF, or CPUT) for $t_p \in T_p$ and $q \in Q$. These profiles offer a comprehensive way to compare algorithms: a higher curve indicates better performance of the associated method. By plotting the performance profiles of the three algorithms in terms of CPUT, NF, and NIter, we can visually assess and compare their efficiency and effectiveness (see Figures 1–3). According to Figure 1, the MRMIL algorithm demonstrates remarkable efficiency, solving 75% of the benchmark problems with the fewest NIter. Furthermore, Figures 2 and 3 provide additional insight into the algorithm's effectiveness: the MRMIL algorithm successfully solves approximately 72% (73%) of the benchmark problems with the fewest NF (CPUT), compared to the MFRM and HTTCGP algorithms, which solve around 20% (19%) and 27% (21%) of the problems, respectively. These results underscore the superior efficiency of the proposed algorithm, highlighting its ability to optimize computational resources while maintaining high performance across a broad range of test cases.
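Given a table of a performance metric with one row per problem and one column per solver, the Dolan-Moré profile can be computed directly (a minimal sketch; function and variable names are illustrative):

```python
import numpy as np

def performance_profile(T, taus):
    # T[p, q]: performance metric (e.g. CPU time) of solver q on problem p.
    # r[p, q] = T[p, q] / min_q' T[p, q'] is the performance ratio, and
    # rho_q(tau) is the fraction of problems with r[p, q] <= tau.
    T = np.asarray(T, dtype=float)
    ratios = T / T.min(axis=1, keepdims=True)
    return np.array([[np.mean(ratios[:, q] <= tau) for tau in taus]
                     for q in range(T.shape[1])])
```

Plotting each row of the result against `taus` reproduces the profile curves shown in Figures 1–3.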
4.3. Experiments on sparse signal restoration
To further illustrate the effectiveness of the MRMIL algorithm in addressing real-world challenges, we apply it to a sparse signal restoration problem [28] as a case study. The objective of this problem is to accurately restore a sparse signal from an observed signal. The quality of the restoration is assessed by the mean squared error (MSE), a widely accepted evaluation metric in signal processing, defined as $\mathrm{MSE} = \|\hat{a} - \tilde{a}\|^2 / n$, where $\hat{a}$ represents the original signal and $\tilde{a}$ the restored signal. In this experiment, the original signal has length $n = 5120$, the number of observations is $k = 1280$, and the non-zero elements, 160 in total, are randomly distributed across the signal.
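The MSE metric above is straightforward to compute (a sketch; names are illustrative):

```python
import numpy as np

def mse(a_original, a_restored):
    # MSE = ||a_hat - a_tilde||^2 / n, as defined above.
    a_original = np.asarray(a_original, dtype=float)
    diff = a_original - np.asarray(a_restored, dtype=float)
    return np.sum(diff ** 2) / a_original.size
```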
Figure 4 provides the results of the three algorithms for the sparse signal restoration problem, where the MRMIL algorithm clearly outperforms the others in terms of both NIter and CPUT. To account for the inherent randomness in the experiment, we conducted 10 independent trials for each algorithm, utilizing different random seeds in each trial. The average results from these experiments are shown in Table 9. The data clearly indicate that the MRMIL algorithm excels in restoring the original signal, surpassing both the MFRM and HTTCGP algorithms by requiring fewer iterations and less computational time. These findings not only validate the superior efficiency of the MRMIL algorithm but also highlight its robustness and consistency across multiple trials.
5. Conclusions
In this paper, we presented a modified RMIL-type conjugate gradient algorithm specifically tailored to solve nonlinear systems of equations with convex constraints. The proposed algorithm incorporates several key innovations that enhance both its theoretical foundation and practical performance. A central feature is the modification of the conjugate parameter to ensure its non-negativity. This adjustment plays a critical role in improving the overall stability of the algorithm by preventing negative conjugate parameters that could destabilize the iterative process. The proposed algorithm also guarantees the sufficient descent and trust-region properties of the search direction independently of any line search technique. We established the global convergence of the algorithm under general conditions, without relying on the commonly imposed assumption of Lipschitz continuity. Extensive numerical experiments were conducted to evaluate the performance of the proposed algorithm. The results demonstrate that it consistently outperforms existing algorithms, particularly in terms of computational efficiency; this is especially notable for large-scale nonlinear systems of equations. Furthermore, the algorithm shows superior performance in signal recovery problems.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This work is supported by the Natural Science Foundation in Guangxi Province, PR China (grant number 2024GXNSFAA010478) and the Guangzhou Huashang College Daoshi Project (grant number 2023HSDS38).
Conflict of interest
The authors declare there are no conflicts of interest.