Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Drought is one of the most natural hazards that cause damage to ecosystems, agricultural production, and water resources. This study has analyzed seasonal and annual rainfall trends using monthly data series of 33 years (1983–2015) in Addis Ababa city over three stations namely; Sendafa, Bole, and Observation. Here, we examined the occurrence of historical drought trends in the study jurisdiction. The Reconnaissance Drought Index (RDI) and the Standardized Precipitation Index (SPI) were employed to find long-term drought trends as well as to examine the occurrence of drought history at a longer duration. The analysis indicated that severe drought conditions were observed for SPI and RDI indices in the year 2013 for Bole station, while medium droughts were recorded for the years 1991 and 2002 for all stations. Similarly, the RDI indices for 1996 was recorded as severe drought for the Observatory station. On the other hand, higher variability (coefficient of variation) of rainfall during winter seasons were 95.8%, 95.9%, and 77.9% for Sendafa, Bole, and Observatory stations respectively. However, the lower coefficient of variation during annual rainfall was 15.59% for Sendafa, 14.38% for Bole, and 13.98% for the Observatory station. Furthermore, the drought severity classification for the long-term drought analysis of annual precipitation shows that 3% of severe drought, 12% of moderate drought, and 85% of the normal condition were recorded in Bole station. The severe and moderate drought indices due to the reduction of rainfall, temperature change, and other factors can cause a shortage of urban water supply. Thus, the results of this study will help the water sector professionals in forecasting weather variations and for better management of urban water resources.
Citation: Zinabu A. Alemu, Emmanuel C. Dioha, Michael O. Dioha. Hydro-meteorological drought in Addis Ababa: A characterization study[J]. AIMS Environmental Science, 2021, 8(2): 148-168. doi: 10.3934/environsci.2021011
[1] | Predrag S. Stanimirović, Branislav Ivanov, Haifeng Ma, Dijana Mosić . A survey of gradient methods for solving nonlinear optimization. Electronic Research Archive, 2020, 28(4): 1573-1624. doi: 10.3934/era.2020115 |
[2] | Yan Xia, Songhua Wang . Global convergence in a modified RMIL-type conjugate gradient algorithm for nonlinear systems of equations and signal recovery. Electronic Research Archive, 2024, 32(11): 6153-6174. doi: 10.3934/era.2024286 |
[3] | Gonglin Yuan, Minjie Huang . An efficient modified HS conjugate gradient algorithm in machine learning. Electronic Research Archive, 2024, 32(11): 6175-6199. doi: 10.3934/era.2024287 |
[4] | Youjun Deng, Hongyu Liu, Xianchao Wang, Dong Wei, Liyan Zhu . Simultaneous recovery of surface heat flux and thickness of a solid structure by ultrasonic measurements. Electronic Research Archive, 2021, 29(5): 3081-3096. doi: 10.3934/era.2021027 |
[5] | Yue Feng, Yujie Liu, Ruishu Wang, Shangyou Zhang . A conforming discontinuous Galerkin finite element method on rectangular partitions. Electronic Research Archive, 2021, 29(3): 2375-2389. doi: 10.3934/era.2020120 |
[6] | Jiaxin Zhang, Hoang Tran, Guannan Zhang . Accelerating reinforcement learning with a Directional-Gaussian-Smoothing evolution strategy. Electronic Research Archive, 2021, 29(6): 4119-4135. doi: 10.3934/era.2021075 |
[7] | Sida Lin, Lixia Meng, Jinlong Yuan, Changzhi Wu, An Li, Chongyang Liu, Jun Xie . Sequential adaptive switching time optimization technique for maximum hands-off control problems. Electronic Research Archive, 2024, 32(4): 2229-2250. doi: 10.3934/era.2024101 |
[8] | Simon Eberle, Arnulf Jentzen, Adrian Riekert, Georg S. Weiss . Existence, uniqueness, and convergence rates for gradient flows in the training of artificial neural networks with ReLU activation. Electronic Research Archive, 2023, 31(5): 2519-2554. doi: 10.3934/era.2023128 |
[9] | Cheng Wang . Convergence analysis of Fourier pseudo-spectral schemes for three-dimensional incompressible Navier-Stokes equations. Electronic Research Archive, 2021, 29(5): 2915-2944. doi: 10.3934/era.2021019 |
[10] | Yineng Ouyang, Zhaotao Liang, Zhihui Ma, Lei Wang, Zhaohua Gong, Jun Xie, Kuikui Gao . A class of constrained optimal control problems arising in an immunotherapy cancer remission process. Electronic Research Archive, 2024, 32(10): 5868-5888. doi: 10.3934/era.2024271 |
Drought is one of the most natural hazards that cause damage to ecosystems, agricultural production, and water resources. This study has analyzed seasonal and annual rainfall trends using monthly data series of 33 years (1983–2015) in Addis Ababa city over three stations namely; Sendafa, Bole, and Observation. Here, we examined the occurrence of historical drought trends in the study jurisdiction. The Reconnaissance Drought Index (RDI) and the Standardized Precipitation Index (SPI) were employed to find long-term drought trends as well as to examine the occurrence of drought history at a longer duration. The analysis indicated that severe drought conditions were observed for SPI and RDI indices in the year 2013 for Bole station, while medium droughts were recorded for the years 1991 and 2002 for all stations. Similarly, the RDI indices for 1996 was recorded as severe drought for the Observatory station. On the other hand, higher variability (coefficient of variation) of rainfall during winter seasons were 95.8%, 95.9%, and 77.9% for Sendafa, Bole, and Observatory stations respectively. However, the lower coefficient of variation during annual rainfall was 15.59% for Sendafa, 14.38% for Bole, and 13.98% for the Observatory station. Furthermore, the drought severity classification for the long-term drought analysis of annual precipitation shows that 3% of severe drought, 12% of moderate drought, and 85% of the normal condition were recorded in Bole station. The severe and moderate drought indices due to the reduction of rainfall, temperature change, and other factors can cause a shortage of urban water supply. Thus, the results of this study will help the water sector professionals in forecasting weather variations and for better management of urban water resources.
In this survey, we focus on solving the unconstrained nonlinear optimization problem
minf(x), x∈Rn, | (1.1) |
where the function
xk+1=xk+αkdk, k≥0, | (1.2) |
where the step-size
The following notations will be used, as usual:
g(x)=∇f(x), G(x)=∇2f(x), gk=∇f(xk), Gk=∇2f(xk), |
where
f(xk+1)≈f(xk)+αkgTkdk. | (1.3) |
Therefore, an appropriate descent search direction
gTkdk<0,for allk. | (1.4) |
Primary choice for descent direction is
xk+1=xk−αkgk. | (1.5) |
In this paper, we survey gradient methods satisfying the descent condition (1.4). If there exists a constant
gTkdk<−c‖gk‖2,for allk, | (1.6) |
where
As for the choice of search direction, one of the possible choices for the search direction in unconditional optimization is to move from the current point along the negative gradient in each iteration, which correspond to
Advantages and disadvantages of GD methods can be summarized as follows.
Another important direction of the search is the Newton's direction
Φ(d):=f(xk+d)≈f(xk)+gTkd+12dTGkd. | (1.7) |
The solution
dk=−G−1kgk. |
So, the pure Newton method is defined by
xk+1=xk−G−1kgk. | (1.8) |
The Newton method with line search uses an appropriate step-size
xk+1=xk−αkG−1kgk, | (1.9) |
wherein the step-size
The Newton method exhibits three major drawbacks in practical applications.
Due to that, numerous modifications of it were created, which can be globally divided into two large groups: modified Newton's methods and quasi-Newton (QN) methods. The QN methods are aimed to address all the above difficulties of the Newton method. The first drawback is overcome by taking an appropriately defined positive definite matrix
QN methods and modified Newton methods belong to most powerful methods for solving unconditional optimization and applicable in many nonlinear optimization problems. A survey of QN methods for solving nonlinear least-squares problems was considered in [86]. The optimization methods have found a number of applications in fluid mechanics [44], free surface flow and solid body contact [13], finding the optimal trajectory for an aircraft or a robot arm, designing a portfolio of investments, controlling a chemical process, computing the optimal shape of an automobile. See [90] for more details. A modification of the quasi-Newton method in defining the two-phase approximate greatest descent was used in [71]. Several variants of multi-step spectral gradient methods for solving large scale unconstrained optimization problems were proposed in [111]. Usage of an optimization algorithm in artificial neural networks was considered in [75]. Properties of Hessian matrix which appear in distributed gradient-based multi-agent control systems was considered in [117]. A survey of derivative-free optimization methods was given in [127]. An application of unconstrained optimization in solving the risk probability was presented in [76].
The study of conjugate gradient (CG) methods was started by Hestenes and Stiefel in 1952 in [60], and the development of CG methods for solving large-scale unconstrained optimization problems is still ongoing. After all these years, there is still a need to find a more efficient CG method that will solve unconstrained optimization problems with thousands of variables in the shortest possible time interval as well as with a minimal number of iterations and function evaluations.
CG methods construct a sequence of approximations
dk:=dk:=d(βk,gk,dk−1)={−g0,k=0,−gk+βkdk−1,k≥1, | (1.10) |
where
Popularity of CG methods is confirmed by a number of recent surveys and book chapters [42,57,85,87,88]. In addition to this basic information on the chronological development of the CG methods, it is also important to mention its applications. In general, CG methods are important in solving large-scale optimization problems. CG methods iterates are characterized by low memory allocation and strong local and global convergence properties. Based on this fact, these methods become useful in all areas where optimization problems of any kind are involved. The CG methods have wide use in solving systems of equations and image restoration problem [12,16,70,78,128,135,136,140], the linear response eigenvalue problem [74], in regression analysis [108,143]. On that way, CG methods have the influence on the development of an artificial neural networks learning algorithms [46,75]. A unique approach to the ABS type CG methods was proposed in [1]. Application of CG methods in solving very large symmetric positive semi definite linear systems that appear in optimal surface parameterizations are described in [64]. Also, it is possible to mention application in data analysis [110]. A variant of the projected preconditioned conjugate gradient method and its application in solving the linear response eigenvalue problem was investigated in [74].
Main goals leading current research paper can be highlighted as follows.
The overall structure of the paper based on contents of each section is described as follows. Section 1 describes basic notation, introductory notions, preliminaries and motivation. Global algorithms and various line search variants are presented in Section 2. Overview of QN methods and their classification are considered in Section 3. Section 4 gives a specific overview of CG methods. Convergence properties of considered CG methods are investigated in Section 5. According to the presented taxonomy of basic CG methods, properties of CG methods with
First, we present an algorithm that describes the general scheme of line search methods
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
To achieve the global convergence of iterative methods, an appropriate step-size
Φ(α):=f(xk+αdk) | (2.1) |
and the step-size is defined after the unidimensional optimization of the form
f(xk+αkdk)=minα>0Φ(α). | (2.2) |
The ELS rule may give the most precise minimum. However, ELS is too expensive in practice or even impossible to implement, especially in situations where the iteration is far from the exact solution.
Applying the iterative procedure (1.2), it is most logical to choose a new point so that the step length
Φ(αk)=f(xk+1)<Φ(0)=f(xk). | (2.3) |
Methods that in each iterative step require the fulfillment of conditions (2.3), that is, the reduction of the value of the objective function, define iterations that in each step approach the minimum of the given function. The methods conceived in this way belong to the class of methods of monotone line search. Many variants of inexact line search (ILS) rules are proposed and dominant in the nonlinear optimization. The most commonly used ILS techniques are Armijo, Goldstein, Wolfe, Powel, Fletcher and other [4,8,19,50,55,56,109,126]. In most conjugate gradient methods, one of the next ILS procedures methods is used to calculate the step length
In contrast to the monotonic line search, the non-monotonic line search is also known in the literature, where it is not necessary to reduce the value of the objective function in each iteration [51,52,53]. Although non-monotone techniques do not provide a minimum approach to the function in each iteration, they are very common in practical applications and have very good convergence properties. A number of nonmonotonic linear search methods have been proposed recently (see, for example, [119,120]).
A backtracking line search scheme based on the Armijo condition, is aimed to determining the maximal value during the moving along a given search vector. It starts with a relatively large step-size estimate and iteratively reduces the step-size value until a decrease in the value of the objective function is observed, according to the local gradient of the goal function. Let
f(xk+βmktdk)≤f(xk)+φβmktgTkdk, t>0. | (2.4) |
The procedure for backtracking line search proposed in [4] starts from the initial value
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
In order to ensure a sufficient decrease of the objective function, Goldstein rule for ILS requires the following conditions:
f(xk+αdk)≤f(xk)+ρtgTkdk | (2.5) |
and
f(xk+αdk)≥f(xk)+(1−ρ)tgTkdk, | (2.6) |
where
Wolfe line search conditions are well-known and these are given by
f(xk+αkdk)≤f(xk)+ηαkgTkdk, | (2.7) |
g(xk+αkdk)Tdk≥σ1gTkdk, | (2.8) |
where
−σ2gTkdk≥g(xk+αkdk)Tdk≥σ1gTkdk | (2.9) |
are often used, where
|g(xk+αkdk)Tdk|≤−σ1gTkdk. | (2.10) |
The condition (2.7) of the Wolfe conditions is called the Armijo condition, which is often used apart or in the form of its variants.
The most general iterative rule of QN type with line search is of the form
xk+1=xk−αkHkgk, | (3.1) |
such that
sk=xk+1−xk,yk=gk+1−gk | (3.2) |
are typical. The update
Bk+1=Bk+Ek, | (3.3) |
where
Bk+1sk=yk. | (3.4) |
The quasi-Newton condition for the matrix
Hk+1yk=sk. | (3.5) |
Methods that require the calculation or approximation of the Hessian matrix and its inverse belong to the class of QN methods as well as its numerous modifications. The pure Newton method requires calculation of second derivatives matrix, which is avoided in QN methods. As a consequence, the Newton method is computationally expensive and exhibits slow computation, while QN methods are computationally cheap and of faster computation. On the other hand, the Newton method requires lesser number of iterative steps and generates more precise convergence path than QN methods.
The Symmetric Rank-One update (SR1) assumes the matrix
Hk+1=Hk+Ek, |
where
Hk+1=Hk+ukvTk, | (3.6) |
where
uk,vk∈Rn. |
The quasi-Newton condition (3.5) initiates
Hk+1yk=(Hk+ukvTk)yk=sk, |
that is
(vTkyk)uk=sk−Hkyk. | (3.7) |
The conclusion is that
sk−Hkyk≠0, |
(otherwise
Hk+1=Hk+1vTkyk(sk−Hkyk)vTk. | (3.8) |
The condition that the approximation
Hk+1=Hk+(sk−Hkyk)(sk−Hkyk)T(sk−Hkyk)Tyk, | (3.9) |
which is the Symmetric Rank One (SR1) update.
For general functions, Conn, Gould and Toint in [20] proved that the sequence of SR1 Hessian approximations converges to the true Hessian provided that the steps are uniformly linearly independent.
The Hessian update
minB‖B−Bk‖, s.t. B=BT, Bsk=yk. | (3.10) |
The solution to (3.10) is equal to
BDFPk+1=(I−ρkyksTk)BDFPk(I−ρkskyTk)+ρkykyTk, ϱk=1yTksk. |
The inverse Hessian update can be generated using the Sherman - Morrison - Woodbury.
Moreover, the DFP update is known as a method of updating, of rank
Hk+1=Hk+aukuTk+bvkvTk, |
where
Hkyk+aukuTkyk+bvkvTkyk=sk. | (3.11) |
The vectors
uk=sk,vk=Hkyk. |
Now, (3.11) implies
a=1uTkyk=1sTkyk,b=−1vTkyk=−1yTkHkyk. |
So we get
HDFPk+1=Hk+sksTksTkyk−HkykyTkHkyTkHkyk. | (3.12) |
Formula (3.12) was proposed by Davidon and later was developed by Fletcher and Powell, so that it is called DFP update.
One famous broadly used updating formula is the Broyden-Fletcher-Goldfarb–Shanno (BFGS) rule. The inverse Hessian update
minH‖H−Hk‖, s.t. H=HT, Hyk=sk. | (3.13) |
Certainly, the BFGS update is overtly known as
HBFGSk+1=(I−ρkskyTk)HBFGSk(I−ρkyksTk)+ρksksTk, ρk=1yTksk=HBFGSk−HkyksTk+skyTkHksTkyk+(1+yTkHkyksTkyk)sksTksTkyk, | (3.14) |
where
The update BFGS formula for the Hessian matrix can be generated using the Sherman-Morrison-Woodbury formula. A rank-one-modification (or perturbation)
M−1=A−1−(1+c∗A−1b)−1A−1bc∗A−1. | (3.15) |
As a result, the following update for
BBFGSk+1=BBFGSk−BksksTkBksTkBksk+ykyTksTkyk. | (3.16) |
The weighted combinations of DFP and BFGS updates give the whole update class, which is known as the Broyden class. This class of update is defined by
Hϕk+1=(1−ϕ)HDFPk+1+ϕHBFGSk+1, | (3.17) |
where
Hϕk+1=HDFPk+1+ϕυkυTk=HBFGSk+1+(ϕ−1)υkυTk=Hk+sksTksTkyk−HkykyTkHkyTkHkyk+ϕυkυTk, | (3.18) |
where
υk=(yTkHkyk)1/2[sksTkyk−HkykyTkHkyk]. | (3.19) |
If we put in (3.18):
The Broyden class of methods can be derived directly from the quasi-Newton equation. Consider the general formula for updating rank 2, which contains
Hk+1=Hk+asksTk+b(HkyksTk+skyTkHk)+cHky′kyTkHk, | (3.20) |
where
1=asTkyk+byTkHkyk,0=1+bsTkyk+cyTkHkyk. | (3.21) |
Here we have two equations with three unknowns, so we can introduce the replacement
b=−ϕ/sTkyk, |
where
Hϕk+1=Hk+sksTksTkyk−HkykyTkHkyTkHkyk+ϕυkυTk=HDFPk+1+ϕυkυTk, |
where
A great effort has been invested to discover QN methods that do not merely possess convergence but it is also better from the BFGS update in the numerical performance. Table 1 shows some of these modifications of the quasi-Newton equations.
Quasi-Newton Eqs. | Ref. | |
[104] | ||
[72] | ||
[123] | ||
[133] | ||
[134] | ||
[59] |
Also, it is important to state spectral gradient method. Therein the updating of the formula for
Bk+1=diag(r(i)k) | (3.22) |
with
r(i)k=11+∑ni=1(ˆs(i)k)2−∑ni=1(ˆs(i)k)(ˆy(i)k)∑ni=1(ˆs(i)k)4(ˆs(i)k)2, ˆsk=sk−711sk−1+211sk−2, |
ˆyk=yk−711yk−1+211yk−2. |
The general framework of the QN algorithm is given in Algorithm 3.
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
The general QN iterative rule with line search
xk+1=xk−αkHkgk | (3.24) |
assumes that
According to Brezinski's classification in [15], the structure of updating
Bk=γkI≈Gk, γk>0, | (3.25) |
where
xk+1=xk−γ−1kαkgk. | (3.26) |
Usually, the parameter
Andrei in [4,6] defined iterations
xk+1=xk−θkαkgk. | (3.27) |
Usage of random values of
xAGDk+1=xAGDk−θAGDkαkgAGDk. | (3.28) |
A few modifications of the scheme (3.26) were promoted in [94,96,97,112,114]. The iterations defined in [112] are of the form (3.26), in which
xSMk+1=xSMk−αk(γSMk)−1gSMk, | (3.29) |
where
γSMk+1=2γSMkγSMk[f(xSMk+1)−f(xSMk)]+αk‖gSMk‖2α2k‖gSMk‖2. | (3.30) |
The Double direction and double step-size accelerated methods, termed as ADSS and ADD, respectively, were originated in [94,96].
The next iterations are known as Accelerated double step-size (ADSS) iterations [94]:
xADSSk+1=xADSSk−(αk(γADSSk)−1+lk)gADSSk, | (3.31) |
where
xTADSSk+1=xTADSSk−ψkgTADSSk, |
where
The particular choice
xk+1=xk−γ−1kgk, | (3.32) |
where
γBBk=sTk−1yk−1yTk−1yk−1. | (3.33) |
The symmetric case assumes the minimization
γBBk=sTk−1sk−1sTk−1yk−1. | (3.34) |
The BB iterations are defined using
xBBk+1=xBBk−γBBkgBBk. |
The BB method was improved in a number of research articles, main of which are [22,24,34,35,36,37,106,107,122,137].
Another member of the IGD iterates is the Scalar Correction (SC) method [84], defined in (3.32) by the rule
γSCk+1={sTkrkyTkrk,yTkrk>0,‖sk‖‖yk‖,yTkrk≤0, rk=sk−γkyk. | (3.35) |
Accordingly, the SC iterations are defined by the relation
xSCk+1=xSCk−γSCkgSCk. |
Relaxed BB method by an additional step
A modification of GD method (1.5) was proposed in [63]. It is defined by
xk+1=M(GD)(xk)=xk−(αk+α2k−α3k)gk. | (3.36) |
Further, the next scheme was proposed as the modified SM (MSM) method in [63]:
xMSMk+1=xMSMk−(αk+α2k−α3k)(γMSMk)−1gMSMk. | (3.37) |
The leading principle used in defining the iterations (3.37) is the replacement of
αk≤αk+α2k−α3k≤αk+α2k. |
So, (3.37) is based on a small increase of
The acceleration parameter
γADDk+1=2f(xADDk+1)−f(xADDk)−αk(gADDk)T(αkdADDk−(γADDk)−1gADDk)(αkdADDk−(γADDk)−1gADDk)T(αkdADDk−(γADDk)−1gADDk),(ADD method[96])γADSSk+1=2f(xADSSk+1)−f(xADSSk)+(αk(γADSSk)−1+lk)‖gADSSk‖2(αk(γADSSk)−1+lk)2‖gADSSk‖2,(ADSS method[94])γTADSSk+1=2f(xTADSSk+1)−f(xTADSSk)+ψk‖gTADSSk‖2ψ2k‖gTADSSk‖2,ψk=αk((γTADSSk)−1−1)+1,(TADSS method[114])γMSMk+1=2γMSMkγMSMk[f(xMSMk+1)−f(xMSMk)]+(αk+α2k−α3k)‖gMSMk‖2(αk+α2k−α3k)2‖gMSMk‖2,(MSM method[63]). |
The efficiency of IGD methods was numerically tested in [98].
The author of [41] proposed two Relaxed Gradient Descent Quasi Newton (RGDQN and RGDQN1) iteration rules
xk+1=xk−ξkαkγ−1kgk, | (3.38) |
such that
ξk=γkαkγk+1. |
The following algorithm is known as the SM method and introduced in [112].
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
An application of the Picard-Mann hybrid iterative process from [66] is another possibility to accelerate iterations for solving nonlinear optimization problems. The function
{x1=x∈C,xk+1=Tyk,yk=(1−Υk)xk+ΥkTxk, k∈N. | (3.39) |
The real number
xk+1=H(T)(xk)=T[(1−Υk)xk+ΥkTxk], k∈N. | (3.40) |
The iterates (3.40) are denoted by
In [66] it was proposed a set of constant values
T(xk):=xSMk−(γSMk)−1αkgSMk, |
the iterations (3.40) become the so called HSM iterative rule given as
xHSMk+1:=H(SM)(xk)=xHSMk−(Υk+1)(γHSMk)−1αkgHSMk, | (3.41) |
where
γHSMk+1=2γHSMkγHSMk[f(xHSMk+1)−f(xHSMk)]+(Υk+1)tk‖gHSMk‖2(Υk+1)2t2k‖gHSMk‖2. |
A modified HSM (MHSM) method is defined in [92] by proposing an appropriate initial value in the backtracking procedure.
A hybridization of the ADD method was considered in [99] in the form
xHADDk+1=xHADDk−(Υk+1)tk(γHADDk)−1gHADDk+(Υk+1)t2kdk, |
wherein
γHADDk+1=2f(xHADDk+1)−f(xHADDk)−(Υk+1)(gHADDk)T(t2kdk−tk(γHADDk)−1gHADDk)(Υk+1)2t2k(tkdk−(γHADDk)−1gHADDk)T(tkdk−(γHADDk)−1gHADDk). |
Recently, the hybridization
Nonlinear conjugate gradient (CG) methods form a class of important methods for solving unconstrained nonlinear optimization and solving system of nonlinear equations. Nonlinear CG methods are defined by the line search iterates (1.2) where the search direction
In this article, a review on CG methods for unconstrained optimization is given. Main convergence theorems are provided for the conjugate gradient method assuming the descent property of each search direction. Some research issues on conjugate gradient methods are mentioned.
In [21], the nonlinear CG methods are divided into three classes: early conjugate gradient methods, descent conjugate gradient methods, and sufficient descent conjugate gradient methods. Andrei classified the CG methods in three classes: classical CG methods, scaled CG methods, and hybrid and parameterized CG methods.
The classification presented in this paper divides CG methods into the following categories: basic conjugate gradients methods, considered in Subsection 4.1, Dai-Liao class of methods, presented in Subsection 4.2, hybrid conjugate gradient methods, described in Subsection 4.3, and combined BFGS-CG iterations, in Subsection 4.4.
The CG methods included in Table 2 are known as early or classical conjugate gradient methods.
Title | Year | Reference | |
1952 | [60] | ||
1964 | [48] | ||
1967 | [38] | ||
1969 | [102,103] | ||
1987 | [47] | ||
1991 | [79] | ||
1999 | [30] |
where
In the listed CG methods, the numerator of the update parameter
Denominator | |||
Numerator | |||
FR | DY | CD | |
PRP | HS | LS |
Define the following functions
n1:=‖gk‖2,n2=yTk−1gk,d1:=‖gk−1‖2,d2=yTk−1dk−1,d3=−gTk−1dk−1. |
Then
βFRk=n1d1, βPRPk=n2d1, βDYk=n1d2, βHSk=n2d2, βCDk=n1d3, βLSk=n2d3. |
But, there exist exceptions to these rules. One example is given in [38]
βDk=gTkGk−1dk−1dTk−1Gk−1dk−1(1967). |
From the presented chronological development of the CGUP, we can see that the
For a large-scale unconstrained nonlinear optimization problem, in practice, choices for updating a CG parameter that do not require computation or approximation of the Hessian and its inverse are preferred over methods that require Hessian or its approximation in each iteration.
Wei et al. [124] gave a variant of the PRP method which we call the VPRP method, with the parameter
βVPRPk=‖gk‖2−‖gk ‖‖gk−1 ‖gTkgk−1‖gk−1‖2. |
The VPRP method was extended to a variant of the HS method by Yao et al. in [132],
βVHSk=‖gk‖2−‖gk ‖‖gk−1 ‖gTkgk−1dTk−1yk−1. |
Zhang [138] took a little modification to the VPRP method and constructed the NPRP method as follows,
βNPRPk=‖gk‖2−‖gk ‖‖gk−1 ‖|gTkgk−1|‖gk−1‖2. |
Moreover, Zhang [138] extended this result to the HS method and proposed the NHS method as follows,
βNHSk=‖gk‖2−‖gk ‖‖gk−1 ‖|gTkgk−1|dTk−1yk−1. |
Recently, Wei et al. [125] proposed a variation of the FR method which we call the VFR method. the parameter
βVFRk=μ1‖gk‖2μ2|gTkdk−1|+μ3‖gk−1‖2, |
where
βDPRPk=‖gk‖2−‖gk ‖‖gk−1 ‖|gTkgk−1|μ|gTkdk−1|+‖gk−1‖2, μ>1. |
Recently, Wei et al. [18] gave a variant of the PRP method which we call the WYL method, that is,
βWYLk=gTk(gk−‖gk‖‖gk−1‖gk−1)‖gk−1‖2. |
The WYL method was extended to a variant of the LS method by Yao et al. [132], that is,
βMLSk=gTk(gk−‖gk‖‖gk−1‖gk−1)−gTkdk−1. |
Also, the following function will be useful:
N1=gTk(gk−‖gk‖‖gk−1‖gk−1)=n1−‖gk‖‖gk−1‖gTkgk−1=gTk^yk−1N2=n1−‖gk‖‖gk−1‖|gTkgk−1|. |
Then
βWYLk=N1d1,βVPRPk=N1d1,βVHSk=N1d2βNPRPk=N2d1,βNHSk=N2d2. |
Some particular CG variants are
βDHSk=‖gk‖2−‖gk ‖‖gk−1 ‖|gTkgk−1|μ|gTkdk−1|+dTk−1yk−1 (2012)βDLSk=‖gk‖2−‖gk ‖‖gk−1 ‖|gTkgk−1|μ|gTkdk−1|−dTk−1gk−1, μ>1 (2017). | (4.1) |
If the functions
D1(μ)=μ|gTkdk−1|+dTk−1yk−1,D2(μ)=μ|gTkdk−1|−dTk−1gk−1 |
are defined, then
βDHSk=N2D1(μ),βDLSk=N2D2(μ). |
An extension of the conjugacy condition
dTkyk−1=0 | (4.2) |
was studied by Perry [93]. Perry tried to incorporate the second-order information of the objective function into the CG method to accelerate it. Specifically, by using the secant condition and the search direction of the QN methods, which are respectively defined by
Bksk−1=yk−1andBkdk=−gk, | (4.3) |
the following relation is obtained:
dTkyk−1=dTk(Bksk−1)=(Bkdk)Tsk−1=−gTksk−1, | (4.4) |
where
dTkyk−1=−gTksk−1. | (4.5) |
Furthermore, Dai and Liao [26] included a nonnegative parameter
dTkyk−1=−tgTksk−1. | (4.6) |
In order to find the search direction
βDLk=gTkyk−1dTk−1yk−1−tgTksk−1dTk−1yk−1=gTk(yk−1−tsk−1)dTk−1yk−1, t>0. | (4.7) |
Expression (4.7) for defining
βDLk=yTk−1gkyTk−1dk−1−tgTksk−1dTk−1yk−1=βHSk−tgTksk−1dTk−1yk−1, (2001) | (4.8) |
βDHSDLk=‖gk‖2−‖gk ‖‖gk−1 ‖|gTkgk−1|μ|gTkdk−1|+dTk−1yk−1−tgTksk−1dTk−1yk−1=βDHSk−tgTksk−1dTk−1yk−1, (2017) | (4.9) |
\begin{eqnarray} \beta_k^{\mathrm{DLSDL}}\!\!\!\!\!& = &\!\!\!\!\!\frac{{\left\|{\mathbf g}_k\right\|}^2-\frac{\left\|{\mathbf g}_k\ \right\|}{\left\|{\mathbf g}_{k-1}\ \right\|}\left|{\mathbf g}_k^ \mathrm{T} {\mathbf g}_{k-1}\right|}{\mu \left|{\mathbf g}_k^ \mathrm{T} {\mathbf d}_{k-1}\right|-{\mathbf d}_{k-1}^ \mathrm{T} {\mathbf g}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} {\mathbf s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} {\mathbf y}_{k-1}} \\ \!& = &\!\beta_k^{\mathrm{DLS}}-t\frac{\mathbf{g}_k^ \mathrm{T} {\mathbf s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} {\mathbf y}_{k-1}},\ (2017) \end{eqnarray} | (4.10) |
\begin{eqnarray} \beta_k^{\mathrm{MHSDL}}\!\!\!\!\!& = &\!\!\!\!\!\frac{{\left\|{\mathbf g}_k\right\|}^2\!-\!\frac{\left\|{\mathbf g}_k\ \right\|}{\left\|{\mathbf g}_{k-1}\ \right\|}{\mathbf g}_k^ \mathrm{T} {\mathbf g}_{k-1}} {{\mathbf d}_{k-1}^ \mathrm{T} {\mathbf y}_{k-1}}\!-\!t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}} \\ \!& = &\!\frac{\mathbf{g}_k^ \mathrm{T} \widehat{\mathbf{y}_{k-1}}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\!-\!t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}},\ (2013)\ \end{eqnarray} | (4.11) |
where
Clearly
Some additional CG methods from the DL class are
\begin{equation} \beta_k^{\mathrm{MLSDL}} = \frac{\mathbf{g}_k^ \mathrm{T} \widehat{\mathbf{y}_{k-1}}}{-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}} \end{equation} | (4.12) |
\begin{equation} \beta_k^{\mathrm{ZZDL}}\! = \!\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{z}_{k-1}}{\mathbf{z}_{k-1}^ \mathrm{T} \mathbf{d}_{k-1}} -t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{z}_{k-1}}, \end{equation} | (4.13) |
where
In order to characterize this family of CG methods, define the mapping
\begin{aligned} \mathfrak{F}\left(\beta_k^{\mathcal{M}},t\right)& = \beta_k^{\mathcal{M}}-t\frac{\mathbf{g}_k^ \mathrm{T} {\mathbf s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} {\mathbf y}_{k-1}}. \end{aligned} |
Then
\beta_k^{\mathrm{DL}} = \mathfrak{F}\left(\beta_k^{\mathrm{HS}},t\right),\quad \beta_k^{\mathrm{DHSDL}} = \mathfrak{F}\left(\beta_k^{\mathrm{DHS}},t\right),\quad |
\beta_k^{\mathrm{DLSDL}} = \mathfrak{F}\left(\beta_k^{\mathrm{DLS}},t\right),\quad \beta_k^{\mathrm{MHSDL}} = \mathfrak{F}\left(\beta_k^{\mathrm{DHS}},t\right). |
The research for the best values of the parameter
\begin{equation*} \label{Equtk1} t_{k} = 2\frac{\|\mathbf{y}_{k-1}\|^2}{\mathbf{y}_{k-1}^ \mathrm{T} \mathbf{s}_{k-1}}. \end{equation*} |
Babaie-Kafaki and Ghanbari [9] presented two appropriate choices of the parameter
\begin{equation*} \label{Equtk3} t_{k} = \frac{\mathbf{s}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}{\|\mathbf{s}_{k-1}\|^2} + \frac{\|\mathbf{y}_{k-1}\|}{\|\mathbf{s}_{k-1}\|} \end{equation*} |
and
\begin{equation*} \label{Equtk4} t_{k} = \frac{\|\mathbf{y}_{k-1}\|}{\|\mathbf{s}_{k-1}\|}. \end{equation*} |
Andrei in [5] suggested the following value for t in (4.7) which becomes a variant of the DL method, denoted by DLE:
\begin{equation} t_{k} = \frac{\mathbf{s}_{k-1}^T \mathbf{y}_{k-1}}{\|\mathbf{s}_{k-1}\|^2}. \end{equation} | (4.14) |
Lotfi and Hosseini in [81] discovered the most recent approximations of the parameter
In the subsequent sections, we will survey recent advances in CG methods. Two main research streams can be observed: the algorithms which improve the scalar parameter
Hybrid CG methods can be segmented into two classes: mixed methods as well as methods combined together by introducing one or more parameters.
The following hybrid CG method was suggested in [121]:
\begin{equation} \beta_{k}\! = \!\left\{ \begin{array}{ll} \beta_k^{\mathrm{PRP}}, & \text{if} \,\,\, 0 \leq \beta_k^{\mathrm{PRP}} \leq \beta_k^{\mathrm{FR}},\\ \beta_k^{\mathrm{FR}}, & \text{otherwise}. \end{array} \right. \end{equation} | (4.15) |
When a jam in iterations occurs again, the PRP update parameter is used. Hu and Storey in [61] had a similar motivation and suggested the following rule
\begin{equation} \beta_k = \max \{ 0, \min \{ \beta_k^{\mathrm{PRP}}, \beta_k^{\mathrm{FR}} \} \}. \end{equation} | (4.16) |
In [49] it is pointed out that
\begin{equation} \beta_k = \max \{ -\beta_k^{\mathrm{FR}}, \min \{ \beta_k^{\mathrm{PRP}}, \beta_k^{\mathrm{FR}} \} \}. \end{equation} | (4.17) |
Dai and Yuan [31] combined the DY method with other CG methods, which leads to the following CGUP parameters:
\begin{equation} \beta_k = \max \{ 0, \min \{ \beta_k^{\mathrm{HS}}, \beta_k^{\mathrm{DY}} \} \} \end{equation} | (4.18) |
and
\begin{equation} \beta_k = \max \{ -c\, \beta_k^{\mathrm{DY}}, \min \{ \beta_k^{\mathrm{HS}}, \beta_k^{\mathrm{DY}} \} \},\ \ c = \frac{1-\sigma}{1+\sigma}. \end{equation} | (4.19) |
In [28], they tested different CG methods for large-scale unconstrained optimization problems and concluded that the hybrid CG method (4.18) gave the best results.
Next hybrid CG method, proposed by Dai in [23], employs either the DY scheme or the CD scheme:
\begin{equation} \beta_k = \frac{\|\mathbf{g}_k\|^2}{\max \{ \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}, -\mathbf{g}_{k-1}^ \mathrm{T} \mathbf{d}_{k-1} \}}. \end{equation} | (4.20) |
A modified CG method defined as the hybridization of known LS and CD conjugate gradient methods is presented and analyzed in [130] by the rule
\begin{equation} \beta_k^{\mathrm{LSCD}}\! = \!\max\{0,\min\{\beta_k^{\mathrm{LS}},\beta_k^{\mathrm{CD}}\}\}. \end{equation} | (4.21) |
CG methods can be combined together by introducing one or more parameters. In [32,33], Dai and Yuan proposed a one-parameter family of CG methods with
\begin{equation} \beta_k = \frac{\|\mathbf{g}_k\|^2}{ \theta_k \|\mathbf{g}_{k-1}\|^2 + (1-\theta_k) \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}, \end{equation} | (4.22) |
where
Another hybrid method, proposed by Delladji, Belloufi and Sellami in [39], exploits either the PRP scheme or the HZ scheme, as
\begin{equation} \begin{aligned} \beta_k^{\mathrm{hPRPHZ}}&\! = \! \theta_k \beta_k^{\mathrm{PRP}} + (1-\theta_k) \beta_k^{\mathrm{HZ}}\\ &\! = \! \theta_k \dfrac{{\mathbf y}_{k-1}^ \mathrm{T} {\mathbf g}_k}{\|{\mathbf g}_{k-1}\|^2} + (1-\theta_k) \frac{1}{{\mathbf d}_{k-1}^ \mathrm{T} {\mathbf y}_{k-1}}\left({\mathbf y}_{k-1} - 2{\mathbf d}_{k-1}\frac{\|{\mathbf y}_{k-1}\|^2}{{\mathbf d}_{k-1}^ \mathrm{T} {\mathbf y}_{k-1}}\right)^ \mathrm{T}{\mathbf g}_{k}, \end{aligned} \end{equation} | (4.23) |
in which
Nazareth in [89] proposed a two-parameter family of CGUP parameters using convex combinations of numerators and denominators, as
\begin{equation} \beta_k = \frac{\nu_k \|\mathbf{g}_k\|^2 + (1-\nu_k)\mathbf{g}_k^ \mathrm{T} \mathbf{y}_{k-1}}{ \theta_k \|\mathbf{g}_{k-1}\|^2 + (1-\theta_k) \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}, \end{equation} | (4.24) |
where
In [113], the authors proposed hybrid CG methods where the search direction
\begin{eqnarray} {\mathbf d}_k &\!: = \!&\mathfrak{D}(\beta_k,{\mathbf g}_k,{\mathbf d}_{k-1}) \! = \!-\left(1+\beta_k\dfrac{{\mathbf g}_k^ \mathrm{T} {\mathbf d}_{k-1}}{\|{\mathbf g}_k\|^2}\right){\mathbf g}_k+\beta_k{\mathbf d}_{k-1} , \end{eqnarray} | (4.25) |
\begin{eqnarray} {\mathbf d}_k &\!: = \!&\mathfrak{D}_1(\beta_k,{\mathbf g}_k,{\mathbf d}_{k-1}) \! = \!-B_k{\mathbf g}_k +\mathfrak{D}(\beta_k,{\mathbf g}_k,{\mathbf d}_{k-1}) , \end{eqnarray} | (4.26) |
and
\begin{equation} \begin{aligned} \beta_k^{\mathrm{LSCD}}&\! = \!\max\{0,\min\{\beta_k^{\mathrm{LS}},\beta_k^{\mathrm{CD}}\}\},\\ {\mathbf d}_k&\! = \!\mathfrak{d}(\beta_k^{\mathrm{LSCD}},{\mathbf g}_k,{\mathbf d}_{k-1}). \end{aligned} \end{equation} | (4.27) |
The resulting method is known as the MLSCD method with the search direction
\begin{equation} {\mathbf d}_k\!: = \!\mathfrak{D}(\beta_k^{\mathrm{LSCD}},{\mathbf g}_k,{\mathbf d}_{k-1}). \end{equation} | (4.28) |
In general, the idea is based on the replacement of
Now we give the general framework of the CG class of methods.
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
The first iteration in CG methods is a gradient step. Also, it is common to restart the algorithm periodically by taking the gradient step.
Known fact is that CG iterates are better than QN methods in terms of the CPU time. Moreover, BFGS updates require greater memory space usage than CG. On the other hand, the QN methods require lesser number of iterations as well as the number of function evaluations. For this purpose, one of the modern trends in defining new CG methods is usage of the BFGS update in defining new rules for defining
{\mathbf d}_k\!: = \!\begin{cases} -B_k{\mathbf g}_k, \,\,k\! = \!0,\\ \mathfrak{D}_1(\beta_k^{\mathrm{LSCD}},{\mathbf g}_k,{\mathbf d}_{k-1}), \,\,k\geq 1. \end{cases} |
In [113], the authors investigated hybrid CG algorithms based on the modified search direction which is defined using one of the following two hybridizations:
\begin{eqnarray} {\mathbf d}_k &\!: = \!&\mathfrak{D}(\beta_k,{\mathbf g}_k,{\mathbf d}_{k-1}) \! = \!-\left(1+\beta_k\dfrac{{\mathbf g}_k^ \mathrm{T} {\mathbf d}_{k-1}}{\|{\mathbf g}_k\|^2}\right){\mathbf g}_k+\beta_k{\mathbf d}_{k-1} , \end{eqnarray} | (4.29) |
\begin{eqnarray} {\mathbf d}_k &\!: = \!&\mathfrak{D}_1(\beta_k,{\mathbf g}_k,{\mathbf d}_{k-1}) \! = \!-B_k{\mathbf g}_k +\mathfrak{D}(\beta_k,{\mathbf g}_k,{\mathbf d}_{k-1}) , \end{eqnarray} | (4.30) |
as well as on the usage of
{\mathbf d}_k\!: = \!\begin{cases} -B_k{\mathbf g}_k, \,\,k\! = \!0,\\ \mathfrak{D}_1(\beta_k^{\mathrm{LSCD}},{\mathbf g}_k,{\mathbf d}_{k-1}), \,\,k\geq 1. \end{cases} |
Below we give some assumptions related to line search procedures.
Assumption 5.1.
\begin{equation} \|\mathbf{g}(\mathbf{u})-\mathbf{g}(\mathbf{v})\|\leq L\|\mathbf{u}-\mathbf{v}\|,\,\, \forall u,v\in\mathcal{N}. \end{equation} | (5.1) |
Assumption 5.1 initiates the existence of a positive constants
\begin{equation} \|\mathbf{u}-\mathbf{v}\|\leq D,\,\,\forall\,\, \mathbf{u},\mathbf{v}\in\mathcal{N} \end{equation} | (5.2) |
and
\begin{equation} \|\mathbf{g}(\mathbf{u})\|\leq\gamma,\,\,\forall\,\, \mathbf{u}\in\mathcal{N}. \end{equation} | (5.3) |
The proof of Lemma 5.1 is given in [142] and known as the Zoutendijk condition.
Lemma 5.1. [17,142] Let Assumption 5.1 be accomplished and the points
\begin{equation} \sum\limits_{k\! = \!0}^{\infty} {\dfrac{\|\mathbf{g}_k\|^4}{\|\mathbf{d}_k\|^2}} < +\infty. \end{equation} | (5.4) |
In this subsection, we recall properties of the HS, PRP and LS methods. If we look at the chronological development presented in Table 2, a clear observation is that HS, PRP and LS methods involve the expression
Property (*) [49] Let a method defined by (1.2) and (1.10) satisfies
\begin{equation} 0 < \gamma\leq \|\mathbf{g}_k\|\leq \bar{\gamma}, \end{equation} | (5.5) |
for all
\begin{equation} |\beta_k| \leq b, \end{equation} | (5.6) |
and
\begin{equation} \|\mathbf{s}_{k-1}\| \leq \lambda \Rightarrow |\beta_k| \leq \frac{1}{2b}. \end{equation} | (5.7) |
In order to prove that conjugate gradient methods have Property (*), it suffices to show that there exists a constant
\begin{equation} |\beta_k| \leq c \|\mathbf{s}_{k-1}\| \,\,\, \text{for all} \,\,\,k, \end{equation} | (5.8) |
under the assumption (5.5). Then, by putting
\begin{equation} \|\mathbf{s}_{k-1}\| \leq \lambda \Rightarrow |\beta_k| \leq \frac{1}{2b}, \end{equation} | (5.9) |
which confirms the Property (*). It is easily shown that (5.8) holds for the HS, PRP and LS methods, and thus these methods have Property (*).
Next, we give the global convergence theorem of CG methods satisfying Property (*). The proof of Theorem 5.2 is given in [49].
Theorem 5.2. Consider any conjugate gradient method (1.2), (1.10) that satisfies the following conditions:
If the Lipschitz and Boundedness Assumptions hold, then the iterates are globally convergent.
In this section, we recall properties of the FR, CD and DY methods. If we look at the chronological development presented in Table 2, it is observable that FR, CD and DY methods involve the value
Proposition 1. The following statements hold:
\begin{equation} - \frac{1}{1-\sigma_1} \leq \frac{\mathbf{g}_k^T \mathbf{d}_k}{\|\mathbf{g}_k\|^2} \leq -1 + \frac{\sigma_2}{1-\sigma_1}. \end{equation} | (5.10) |
\begin{equation} - \frac{1}{1-\sigma_1} \leq \frac{\mathbf{g}_k^T \mathbf{d}_k}{\|\mathbf{g}_k\|^2} \leq - \frac{1}{1+\sigma_2}. \end{equation} | (5.11) |
\begin{equation} -1 - \sigma_1 \leq \frac{\mathbf{g}_k^T \mathbf{d}_k}{\|\mathbf{g}_k\|^2} \leq -1 +\sigma_2. \end{equation} | (5.12) |
Proposition 1 implies that the FR, CD and DY methods satisfy the sufficient descent condition (1.6), dependent on line searches.
We now give the global convergence properties of the FR and DY methods, which were proven in [2] and [30], respectively.
Theorem 5.3. Suppose that Assumption 5.1 holds. Let the sequence
\begin{equation} \liminf\limits_{k\rightarrow\infty}\|\mathbf{g}_k\| = 0. \end{equation} | (5.13) |
Methods surveyed in Subsection 5.2 exhibit strong convergence properties, but they may not be efficient in practice due to appearance of jamming. On the other hand however, methods given in Subsection 5.1 may not be convergent in general, they often perform better than the methods restated in Subsection 5.2. Details about this fact are given in [65]. This clearly implies that combinations of these methods have been proposed with the aim of exploiting attractive features of each family of methods.
The following assumptions will be commonly used in the subsequent convergence analysis of DHSDL, DLSDL, MHSDL and MLSDL methods.
It is supposed that the conditions in Assumption 5.1 hold. Assumption 5.1 initiates the existence of positive constants
By the uniform convexity of
\begin{equation} (\mathbf{g}(\mathbf{u})-\mathbf{g}(\mathbf{v}))^ \mathrm{T}(\mathbf{u}-\mathbf{v})\geq\theta\|\mathbf{u}-\mathbf{v}\|^2,\,\,\text{for all}\,\, \mathbf{u},\mathbf{v}\in S, \end{equation} | (5.14) |
or equivalently,
\begin{equation} f(\mathbf{u})\geq f(\mathbf{v})+\mathbf{g}(\mathbf{v})^ \mathrm{T}(\mathbf{u}-\mathbf{v})+\frac{\theta}{2}\|\mathbf{u}-\mathbf{v}\|^2,\,\,\text{for all}\,\, \mathbf{u},\mathbf{v}\in S. \end{equation} | (5.15) |
It follows from (5.14) and (5.15) that
\begin{equation} \mathbf{s}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\geq\theta\|\mathbf{s}_{k-1}\|^2 \end{equation} | (5.16) |
and
\begin{equation} f_{k-1}-f_{k}\geq -\mathbf{g}_{k}^T \mathbf{s}_{k-1}+\frac{\theta}{2}\|\mathbf{s}_{k-1}\|^2. \end{equation} | (5.17) |
By (5.1) and (5.16), we have
\begin{equation} \theta\|\mathbf{s}_{k-1}\|^2 \leq \mathbf{s}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\leq L\|\mathbf{s}_{k-1}\|^2, \end{equation} | (5.18) |
where the inequalities imply
Further, (5.18) implies
\begin{equation} \mathbf{s}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1} = \alpha_{k-1}\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1} > 0. \end{equation} | (5.19) |
From
\begin{equation} \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1} > 0. \end{equation} | (5.20) |
In order to improve presentation, an arbitrary method defined by (1.2) and (1.10) will be denoted by
\begin{equation} \beta_k\! = \!\beta_k'-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}. \end{equation} | (5.21) |
Lemma 5.4. [17,142] Let Assumption 5.1 be accomplished and the points
Lemma 5.5. The parameters
\begin{equation} 0\leq \beta_k'\leq\frac{\|\mathbf{g}_k\|^2}{\lambda |\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|},\ \lambda \geq 1 \end{equation} | (5.22) |
in each iterative step
Proof. In the case
0\!\leq\! {\beta}^{\mathrm{DHS}}_k,\ \ {\beta}^{\mathrm{DLS}}_k \!\leq\!\frac{\|\mathbf{g}_k\|^2}{\mu |\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|} |
are known from [29].
For
0 \leq {\beta}^{\mathrm{MHS}}_k = \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\ \right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}} \leq \frac{\|\mathbf{g}_k\|^2}{|\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|}. |
For
0 \leq {\beta}^{\mathrm{MLS}}_k = \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\ \right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}} \leq \frac{\|\mathbf{g}_k\|^2}{|\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|}. |
Since
Lemma 5.6. The iterations
\begin{equation} \mathbf{g}_k^ \mathrm{T} \mathbf{d}_k\!\leq\! - c\|\mathbf{g}_k\|^2 \end{equation} | (5.23) |
for some
Proof. The inequality (5.23) will be verified by induction. In the initial situation
\begin{equation} \mathbf{g}_k^ \mathrm{T} \mathbf{d}_k\! = \! - \|\mathbf{g}_k\|^2 + \beta_k \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}. \end{equation} | (5.24) |
An application of (5.21) and
\begin{aligned} \mathbf{g}_k^ \mathrm{T} \mathbf{d}_k &\! = \! - \|\mathbf{g}_k\|^2 + \left({\beta}_k' - t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\right) \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}\\ &\! = \! - \|\mathbf{g}_k\|^2 + {\beta}_k' \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1} - t\frac{\alpha_{k-1} \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}} \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}\\ &\! = \! - \|\mathbf{g}_k\|^2 + {\beta}_k' \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1} - t\frac{\alpha_{k-1} (\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1})^2}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}.\\ \end{aligned} |
From (5.20),
\begin{equation} t\frac{\alpha_{k-1} (\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1})^2}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}} > 0. \end{equation} | (5.25) |
Now (5.25) in conjunction with (5.22) implies
\begin{aligned} \mathbf{g}_k^ \mathrm{T} \mathbf{d}_k &\!\leq\! - \|\mathbf{g}_k\|^2 + {\beta}_k' \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}\\ &\!\leq\! - \|\mathbf{g}_k\|^2 + \frac{\|\mathbf{g}_k\|^2}{\lambda |\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|} |\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|\\ &\!\leq\! - \|\mathbf{g}_k\|^2 + \frac{\|\mathbf{g}_k\|^2}{\lambda }\\ &\! = \! -\left(1- \frac{1}{\lambda}\right) \|\mathbf{g}_k\|^2. \end{aligned} |
In view of
Lemma 5.7. The parameter
\begin{equation} \beta_k\leq \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|-t\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}. \end{equation} | (5.26) |
Proof. For
\begin{equation} \begin{aligned} \beta_k^{\mathrm{DHSDL}}&\! = \!\frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{\mu \left|\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}\right|+\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\\ &\!\leq\! \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\\ &\! = \! \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|-t\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}. \end{aligned} \end{equation} | (5.27) |
As
\begin{equation} \begin{aligned} \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}&\! = \! \mathbf{d}_{k-1}^ \mathrm{T} (\mathbf{g}_k - \mathbf{g}_{k-1})\\ &\! = \! \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_k - \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\\ &\!\leq\! \left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_k\right| - \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\\ &\!\leq\! \mu\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_k\right| - \mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}. \end{aligned} \end{equation} | (5.28) |
Using inequalities (5.20) and (5.28) in
\begin{equation} \begin{aligned} \beta_k^{\mathrm{DLSDL}}&\! = \!\frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\ \right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{\mu \left|\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}\right|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\\ &\!\leq\! \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\ \right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\\ &\! = \! \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|-t\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}. \end{aligned} \end{equation} | (5.29) |
The following conclusion is valid in the case
\begin{equation} \begin{aligned} \beta_k^{\mathrm{MHSDL}}&\! = \! \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\ \right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\\ &\! = \! \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|-t\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}. \end{aligned} \end{equation} | (5.30) |
Therefore, the inequality (5.26) follows from (5.27), (5.29) and (5.30).
The global convergence of the proposed methods is confirmed by Theorem 5.8.
Theorem 5.8. Assume that Assumption 5.1 is true and
\begin{equation} \liminf\limits_{k\rightarrow\infty}{\|\mathbf{g}_k\|}\! = \!0. \end{equation} | (5.31) |
Proof. Assume that (5.31) is not true. This implies the existence of a constant
\begin{equation} \|\mathbf{g}_k\|\geq c_1,\,\,\,\text{for all}\,\,\,k. \end{equation} | (5.32) |
Squaring both sides of (1.10) implies
\begin{equation} \|\mathbf{d}_k\|^2 \! = \! \|\mathbf{g}_k\|^2 - 2 \beta_k \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}+ (\beta_k)^2 \|\mathbf{d}_{k-1}\|^2. \end{equation} | (5.33) |
Taking into account (5.22), one can obtain
\begin{equation} \begin{aligned} - 2 \beta_k\, \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1} &\! = \! - 2 \left({\beta}_k' - t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}} \right) \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}\\ &\! = \! - 2 \left({\beta}_k' \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1} - t\frac{\alpha_k (\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1})^2}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}} \right).\\ \end{aligned} \end{equation} | (5.34) |
Now from (5.25), with respect to
\begin{equation} \begin{aligned} - 2 \beta_k \mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}&\!\leq\! 2 |{\beta}_k'| |\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|\\ &\!\leq\! 2 \frac{\|\mathbf{g}_k\|^2}{\lambda |\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|} |\mathbf{g}_k^ \mathrm{T} \mathbf{d}_{k-1}|\\ &\!\leq\! 2 \frac{\|\mathbf{g}_k\|^2}{\lambda}. \end{aligned} \end{equation} | (5.35) |
Case 1. In the cases
\begin{equation} \begin{aligned} \beta_k &\leq \frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|-t\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\\ &\leq \left|\frac{\mathbf{g}_k^T \mathbf{g}_k-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|-t\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\right|\\ &\leq \frac{\left|\mathbf{g}_k^T\left(\mathbf{g}_k-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\mathbf{g}_{k-1}-t \mathbf{s}_{k-1}\right)\right|}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ & = \frac{\left|\mathbf{g}_k^T\left(\mathbf{g}_k-\mathbf{g}_{k-1}+\mathbf{g}_{k-1}-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\mathbf{g}_{k-1}-t \mathbf{s}_{k-1}\right)\right|}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ &\leq \frac{\left\|\mathbf{g}_k\right\|\left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\| + \left\|\mathbf{g}_{k-1}\left(1-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\right)\right\|+t \left\|\mathbf{s}_{k-1}\right\|\right)}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ &\leq \frac{\left\|\mathbf{g}_k\right\|\left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\| + \left|1-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\right| \left\|\mathbf{g}_{k-1}\right\|+t \left\|\mathbf{s}_{k-1}\right\|\right)}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ &\leq \frac{\left\|\mathbf{g}_k\right\|\left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\| + \left|\left\|\mathbf{g}_{k-1}\right\|-\left\|\mathbf{g}_k\right\|\right| + t \left\|\mathbf{s}_{k-1}\right\|\right)}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ &\leq \frac{\left\|\mathbf{g}_k\right\|\left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\| + \left\|\mathbf{g}_{k-1}-\mathbf{g}_k\right\| + t \left\|\mathbf{s}_{k-1}\right\|\right)}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2} \end{aligned} \end{equation} | (5.36) |
So,
\begin{equation} \begin{aligned} \beta_k &\leq \frac{\left\|\mathbf{g}_k\right\|\left(2\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\| + t \left\|\mathbf{s}_{k-1}\right\|\right)}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ &\leq \frac{\left\|\mathbf{g}_k\right\|\left(2L\left\|\mathbf{s}_{k-1}\right\| + t \left\|\mathbf{s}_{k-1}\right\|\right)}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ &\leq \frac{\left(2L + t\right) \left\|\mathbf{g}_k\right\|\left\|\mathbf{s}_{k-1}\right\|}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2} \end{aligned} \end{equation} | (5.37) |
\begin{equation*} \begin{aligned}& = \frac{\left(2L + t\right) \left\|\mathbf{g}_k\right\| \alpha_{k-1}\left\|\mathbf{d}_{k-1}\right\|}{\theta\alpha_{k-1}{\left\|\mathbf{d}_{k-1}\right\|}^2}\\ & = \frac{\left(2L + t\right) \left\|\mathbf{g}_k\right\|}{\theta\left\|\mathbf{d}_{k-1}\right\|}. \end{aligned} \end{equation*} |
Using (5.35) and (5.37) in (5.33), we obtain
\begin{equation} \begin{aligned} \|\mathbf{d}_k\|^2 &\!\leq\! \|\mathbf{g}_k\|^2 + 2 \frac{\|\mathbf{g}_k\|^2}{\lambda}+ \frac{\left(2L + t\right)^2 {\left\|\mathbf{g}_k\right\|}^2}{\theta^2{\left\|\mathbf{d}_{k-1}\right\|}^2} \|\mathbf{d}_{k-1}\|^2\\ &\!\leq\! \|\mathbf{g}_k\|^2 + 2 \frac{\|\mathbf{g}_k\|^2}{\lambda}+ \frac{\left(2L + t\right)^2}{\theta^2}{\left\|\mathbf{g}_k\right\|}^2\\ &\!\leq\! \left(1 + \frac{2}{\lambda}+ \frac{\left(2L + t\right)^2}{\theta^2}\right){\left\|\mathbf{g}_k\right\|}^2\\ &\!\leq\! \left(\frac{\lambda+2}{\lambda}+ \frac{\left(2L + t\right)^2}{\theta^2}\right){\left\|\mathbf{g}_k\right\|}^2\\ &\!\leq\! \frac{(\lambda+2)\theta^2 + \lambda\left(2L + t\right)^2}{\lambda\theta^2}{\left\|\mathbf{g}_k\right\|}^2. \end{aligned} \end{equation} | (5.38) |
Next, dividing both sides of (5.38) by
\begin{equation} \begin{aligned} \frac{\|\mathbf{d}_k\|^2}{\|\mathbf{g}_k\|^4} &\!\leq\! \frac{(\lambda+2)\theta^2 + \lambda\left(2L + t\right)^2}{\lambda\theta^2} \cdot \frac{1}{c_1^2}\\ \frac{\|\mathbf{g}_k\|^4}{\|\mathbf{d}_k\|^2} &\!\geq\!\frac{\lambda\theta^2\cdot c_1^2}{(\lambda+2)\theta^2 + \lambda\left(2L + t\right)^2} . \end{aligned} \end{equation} | (5.39) |
The inequalities in (5.39) imply
\begin{equation} \sum\limits_{k = 0}^{\infty}\frac{\|\mathbf{g}_k\|^4}{\|\mathbf{d}_k\|^2}\geq \sum\limits_{k = 0}^{\infty}\frac{\lambda\theta^2\cdot c_1^2}{(\lambda+2)\theta^2 + \lambda\left(2L + t\right)^2}\! = \!\infty. \end{equation} | (5.40) |
Therefore,
Case 2. In the cases
\begin{equation*} \begin{aligned} {\beta}^{\mathrm{MLSDL}}_k &\leq \left|\frac{{\left\|\mathbf{g}_k\right\|}^2-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}}-t\frac{\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}}{\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}}\right|\\ &\leq \left|\frac{\mathbf{g}_k^T \mathbf{g}_k-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\left|\mathbf{g}_k^ \mathrm{T} \mathbf{g}_{k-1}\right|}{-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}}\right|+t \frac{\left|\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}\right|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ &\leq \frac{\left|\mathbf{g}_k^T \left(\mathbf{g}_k-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\mathbf{g}_{k-1}\right)\right|}{\left|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\right|}+t \frac{\left|\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}\right|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ & = \frac{\left|\mathbf{g}_k^T \left(\mathbf{g}_k-\mathbf{g}_{k-1}+\mathbf{g}_{k-1}-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\mathbf{g}_{k-1}\right)\right|}{\left|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\right|}+t \frac{\left|\mathbf{g}_k^ \mathrm{T} \mathbf{s}_{k-1}\right|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|} \end{aligned} \end{equation*} |
\begin{equation} \begin{aligned}&\leq \frac{\left\|\mathbf{g}_k\right\| \left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\|+ \left\|\mathbf{g}_{k-1}-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\mathbf{g}_{k-1}\right\|\right)}{\left|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\right|}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ &\leq \frac{\left\|\mathbf{g}_k\right\| \left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\|+ \left\|\mathbf{g}_{k-1}(1-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|})\right\|\right)}{\left|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\right|}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ &\leq \frac{\left\|\mathbf{g}_k\right\| \left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\|+ \left|1-\frac{\left\|\mathbf{g}_k\right\|}{\left\|\mathbf{g}_{k-1}\right\|}\right|\left\|\mathbf{g}_{k-1}\right\|\right)}{\left|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\right|}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ &\leq \frac{\left\|\mathbf{g}_k\right\| \left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\|+ \left|\left\|\mathbf{g}_{k-1}\right\|-\left\|\mathbf{g}_k\right\|\right|\right)}{\left|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\right|}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ &\leq \frac{\left\|\mathbf{g}_k\right\| \left(\left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\|+ \left\|\mathbf{g}_k - \mathbf{g}_{k-1}\right\|\right)}{\left|-\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{g}_{k-1}\right|}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ &\leq \frac{2\cdot\left\|\mathbf{g}_k\right\| \left\|\mathbf{g}_k-\mathbf{g}_{k-1}\right\|}{c{\left\|\mathbf{g}_{k-1}\right\|}^2}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}\\ &\leq \frac{2L\cdot\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{c{\left\|\mathbf{g}_{k-1}\right\|}^2}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\left|\mathbf{d}_{k-1}^ \mathrm{T} \mathbf{y}_{k-1}\right|}. \end{aligned} \end{equation} | (5.41) |
From (5.18), (5.19) and (5.41), we conclude
\begin{equation} \begin{aligned} {\beta}^{\mathrm{MLSDL}}_k &\leq \frac{2L\cdot\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{c{\left\|\mathbf{g}_{k-1}\right\|}^2}+t \frac{\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{\theta{\left\|\mathbf{s}_{k-1}\right\|}^2}\\ & = \frac{2L\cdot\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{c{\left\|\mathbf{g}_{k-1}\right\|}^2}+ \frac{t\left\|\mathbf{g}_k\right\|}{\theta\left\|\mathbf{s}_{k-1}\right\|}\\ &\leq \frac{2L\cdot\left\|\mathbf{g}_k\right\| \left\|\mathbf{s}_{k-1}\right\|}{c\cdot c_1^2}+ \frac{t\left\|\mathbf{g}_k\right\|}{\theta\left\|\mathbf{s}_{k-1}\right\|}\\ &\leq \frac{2L\theta\cdot\left\|\mathbf{g}_k\right\| {\left\|\mathbf{s}_{k-1}\right\|}^2+ c\cdot c_1^2\cdot t\left\|\mathbf{g}_k\right\|}{c\cdot c_1^2 \cdot\theta\left\|\mathbf{s}_{k-1}\right\|}\\ &\leq \frac{\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)\left\|\mathbf{g}_k\right\|}{c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\left\|\mathbf{d}_{k-1}\right\|}. \end{aligned} \end{equation} | (5.42) |
Replacement of (5.35) and (5.42) in (5.33) leads to
\begin{equation} \begin{aligned} \|\mathbf{d}_k\|^2 &\!\leq\! \|\mathbf{g}_k\|^2 + 2 \frac{\|\mathbf{g}_k\|^2}{\lambda}+ \frac{\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2{\left\|\mathbf{g}_k\right\|}^2}{\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2{\left\|\mathbf{d}_{k-1}\right\|}^2} \|\mathbf{d}_{k-1}\|^2\\ &\! = \! \|\mathbf{g}_k\|^2 + 2 \frac{\|\mathbf{g}_k\|^2}{\lambda}+ \frac{\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2}{\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2}{\left\|\mathbf{g}_k\right\|}^2\\ &\! = \! \left(1 + \frac{2}{\lambda}+ \frac{\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2}{\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2}\right){\left\|\mathbf{g}_k\right\|}^2\\ &\! = \! \left(\frac{\lambda+2}{\lambda}+ \frac{\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2}{\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2}\right){\left\|\mathbf{g}_k\right\|}^2\\ &\! = \! \frac{(\lambda+2)\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2 + \lambda\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2}{\lambda\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2}{\left\|\mathbf{g}_k\right\|}^2. \end{aligned} \end{equation} | (5.43) |
Next, dividing both sides of (5.43) by
\begin{equation} \begin{aligned} \frac{\|\mathbf{d}_k\|^2}{\|\mathbf{g}_k\|^4} &\!\leq\! \frac{(\lambda+2)\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2 + \lambda\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2}{\lambda\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2} \cdot \frac{1}{c_1^2}\\ \frac{\|\mathbf{g}_k\|^4}{\|\mathbf{d}_k\|^2} &\!\geq\!\frac{\lambda\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2\cdot c_1^2}{(\lambda+2)\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2 + \lambda\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2}. \end{aligned} \end{equation} | (5.44) |
The inequalities in (5.44) imply
\begin{equation} \sum\limits_{k = 0}^{\infty}\frac{\|\mathbf{g}_k\|^4}{\|\mathbf{d}_k\|^2}\geq \sum\limits_{k = 0}^{\infty}\frac{\lambda\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2\cdot c_1^2}{(\lambda+2)\left(c\cdot c_1^2 \cdot\theta\cdot\alpha_{k-1}\right)^2 + \lambda\left(2L\theta\cdot {D}^2+ c\cdot c_1^2\cdot t\right)^2} \! = \!\infty. \end{equation} | (5.45) |
Therefore,
The code used in the testing experiments is written in the software Matlab R2017a, and executed on the personal computer Workstation Intel Core i3
The numerical experiments are performed on contains functions presented in [3], where much of the problems are taken over from CUTEr collection [14]. Each test function is tested
Strong Wolfe line search use the following choice of parameters for all algorithms
We utilized the performance profile given in [43] to compare numerical results (IT, FE and CPU) for all tested methods. The upper curve of the selected performance profile corresponds to the method that shows the best performance.
BFGS, DFP, SR1 updates in QN methods with respect to different ILS strategies are compared in [40]. The main conclusion is that the BFGS method is superior to the others. Continuing such research, we compare the numerical performances obtained from AGD, MSM and SM methods, i.e, gradient methods with acceleration parameter. The numerical experiment contains
The uniform stopping criteria in this numerical experiments are
\|\mathbf{g}_k\| \leq 10^{-6} \ \ \ \mathrm{and}\ \ \ \frac{|f_{k+1}-f_k|}{1+|f_k|} \leq 10^{-16}. |
Summary numerical data generated by AGD, MSM and SM method, tried on 25 test functions, are arranged in Table 4.
IT profile | FE profile | CPU time | |||||||
Test function | AGD | MSM | SM | AGD | MSM | SM | AGD | MSM | SM |
Perturbed Quadratic | 353897 | 34828 | 59908 | 13916515 | 200106 | 337910 | 6756.047 | 116.281 | 185.641 |
Raydan 1 | 22620 | 26046 | 14918 | 431804 | 311260 | 81412 | 158.359 | 31.906 | 36.078 |
Diagonal 3 | 120416 | 7030 | 12827 | 4264718 | 38158 | 69906 | 5527.844 | 52.609 | 102.875 |
Generalized Tridiagonal 1 | 670 | 346 | 325 | 9334 | 1191 | 1094 | 11.344 | 1.469 | 1.203 |
Extended Tridiagonal 1 | 3564 | 1370 | 4206 | 14292 | 10989 | 35621 | 55.891 | 29.047 | 90.281 |
Extended TET | 443 | 156 | 156 | 3794 | 528 | 528 | 3.219 | 0.516 | 0.594 |
Diagonal 4 | 120 | 96 | 96 | 1332 | 636 | 636 | 0.781 | 0.203 | 0.141 |
Extended Himmelblau | 396 | 260 | 196 | 6897 | 976 | 668 | 1.953 | 0.297 | 0.188 |
Perturbed quadratic diagonal | 2542050 | 37454 | 44903 | 94921578 | 341299 | 460028 | 44978.750 | 139.625 | 185.266 |
Quadratic QF1 | 366183 | 36169 | 62927 | 13310016 | 208286 | 352975 | 12602.563 | 81.531 | 138.172 |
Extended quadratic penalty QP1 | 210 | 369 | 271 | 2613 | 2196 | 2326 | 1.266 | 1.000 | 0.797 |
Extended quadratic penalty QP2 | 395887 | 1674 | 3489 | 9852040 | 11491 | 25905 | 3558.734 | 3.516 | 6.547 |
Quadratic QF2 | 100286 | 32727 | 64076 | 3989239 | 183142 | 353935 | 1582.766 | 73.438 | 132.703 |
Extended quadratic exponential EP1 | 48 | 100 | 73 | 990 | 894 | 661 | 0.750 | 0.688 | 0.438 |
Extended Tridiagonal 2 | 1657 | 659 | 543 | 8166 | 2866 | 2728 | 3.719 | 1.047 | 1.031 |
ARWHEAD (CUTE) | 5667 | 430 | 270 | 214284 | 5322 | 3919 | 95.641 | 1.969 | 1.359 |
Almost Perturbed Quadratic | 356094 | 33652 | 60789 | 14003318 | 194876 | 338797 | 13337.125 | 73.047 | 133.516 |
LIARWHD (CUTE) | 1054019 | 3029 | 18691 | 47476667 | 27974 | 180457 | 27221.516 | 9.250 | 82.016 |
ENGVAL1 (CUTE) | 743 | 461 | 375 | 6882 | 2285 | 2702 | 3.906 | 1.047 | 1.188 |
QUARTC (CUTE) | 171 | 217 | 290 | 402 | 494 | 640 | 2.469 | 1.844 | 2.313 |
Generalized Quartic | 187 | 181 | 189 | 849 | 493 | 507 | 0.797 | 0.281 | 0.188 |
Diagonal 7 | 72 | 147 | 108 | 333 | 504 | 335 | 0.625 | 0.547 | 0.375 |
Diagonal 8 | 60 | 120 | 118 | 304 | 383 | 711 | 0.438 | 0.469 | 0.797 |
Full Hessian FH3 | 45 | 63 | 63 | 1352 | 566 | 631 | 1.438 | 0.391 | 0.391 |
Diagonal 9 | 329768 | 10540 | 13619 | 13144711 | 68189 | 89287 | 6353.172 | 43.609 | 38.672 |
Table 4 contains numerical results corresponding to IT, FE and CPU criteria for the AGD, MSM and SM methods. Figures 1, 2, 3 illustrate the performance profiles corresponding to the results in Table 4 corresponding to the criterion IT, GE and CPU, respectively.
From Table 4, we conclude that the AGD, MSM and SM methods have successfully solved all test functions.
Figure 1 presents the performance profiles of the IT of the AGD, MSM and SM methods. In this figure, it is observable that MSM method is best in
From Figure 1, it is observable that the graph of the MSM method comes first to the top, which means that the MSM is superior compared to other considered methods with respect to the IT profile.
Figure 2 presents the performance profiles of the FE of the AGD, MSM and SM methods. It is observable that MSM method is best in
Figure 3 presents the performance profiles of the CPU of the AGD, MSM and SM methods. It is obvious that MSM is winer in
From the previous analysis of the results shown in Table 4 and Figures 1-3, we can conclude that the MSM iterates are most efficient in terms of all three basic metrics: IT, FE and CPU. The MSM method has the smallest IT, FE and the CPU time compared to the other two methods on the most test functions.
The uniform stopping criterion during testing CG methods is
\|{ \mathbf{g}}_k\| \leq \epsilon, |
where
In this subsection, we compare the numerical results obtained from HS, PRP and LS methods, i.e., methods with
IT profile | FE profile | CPU time | |||||||
Test function | HS | PRP | LS | HS | PRP | LS | HS | PRP | LS |
Perturbed Quadratic | 1157 | 1157 | 6662 | 3481 | 3481 | 19996 | 0.234 | 0.719 | 1.438 |
Raydan 2 | NaN | 174 | 40 | NaN | 373 | 120 | NaN | 0.094 | 0.078 |
Diagonal 2 | NaN | 1721 | 5007 | NaN | 6594 | 15498 | NaN | 1.313 | 2.891 |
Extended Tridiagonal 1 | NaN | 170 | 17079 | NaN | 560 | 54812 | NaN | 0.422 | 13.641 |
Diagonal 4 | NaN | 70 | 1927 | NaN | 180 | 5739 | NaN | 0.078 | 0.391 |
Diagonal 5 | NaN | 154 | 30 | NaN | 338 | 90 | NaN | 0.172 | 0.078 |
Extended Himmelblau | 160 | 120 | 241 | 820 | 600 | 1043 | 0.172 | 0.125 | 0.172 |
Full Hessian FH2 | 5096 | 5686 | 348414 | 15294 | 17065 | 1045123 | 83.891 | 80.625 | 5081.875 |
Perturbed quadratic diagonal | 1472 | 1120 | 21667 | 4419 | 3363 | 65057 | 0.438 | 0.391 | 2.547 |
Quadratic QF1 | 1158 | 1158 | 5612 | 3484 | 3484 | 16813 | 0.281 | 0.313 | 1.047 |
Extended quadratic penalty QP2 | NaN | 533 | NaN | NaN | 5395 | NaN | NaN | 0.781 | NaN |
Quadratic QF2 | 2056 | 2311 | NaN | 9168 | 9862 | NaN | 0.969 | 0.859 | NaN |
Extended quadratic exponential EP1 | NaN | NaN | 70 | NaN | NaN | 350 | NaN | NaN | 0.141 |
TRIDIA (CUTE) | 6835 | 6744 | NaN | 20521 | 20248 | NaN | 1.438 | 1.094 | NaN |
Almost Perturbed Quadratic | 1158 | 1158 | 5996 | 3484 | 3484 | 17998 | 0.281 | 0.328 | 1.063 |
LIARWHD (CUTE) | NaN | 408 | 11498 | NaN | 4571 | 50814 | NaN | 0.438 | 2.969 |
POWER (CUTE) | 7781 | 7789 | 190882 | 23353 | 23377 | 572656 | 1.422 | 1.219 | 14.609 |
NONSCOMP (CUTE) | 4545 | 3647 | NaN | 15128 | 12433 | NaN | 0.875 | 0.656 | NaN |
QUARTC (CUTE) | NaN | 165 | 155 | NaN | 1347 | 1466 | NaN | 0.781 | 0.766 |
Diagonal 6 | NaN | 174 | 137 | NaN | 373 | 442 | NaN | 0.109 | 0.125 |
DIXON3DQ (CUTE) | NaN | 12595 | 12039 | NaN | 37714 | 36091 | NaN | 1.641 | 2.859 |
BIGGSB1 (CUTE) | NaN | 11454 | 11517 | NaN | 34293 | 34530 | NaN | 1.969 | 2.141 |
Generalized Quartic | NaN | 134 | 139 | NaN | 458 | 445 | NaN | 0.125 | 0.094 |
Diagonal 7 | NaN | 51 | 80 | NaN | 142 | 240 | NaN | 0.063 | 0.109 |
Diagonal 8 | NaN | 70 | 80 | NaN | 180 | 180 | NaN | 0.063 | 0.125 |
FLETCHCR (CUTE) | 18292 | 19084 | 20354 | 178305 | 170266 | 171992 | 8.859 | 6.203 | 7.484 |
Table 5 shows the numerical results (IT, FE and CPU) for the HS, PRP and LS methods.
Figures 4, 5 and 6 plot the performance profiles for the results in Table 5 with respect to IT, FE and CPU criterion, respectively.
Figure 4 presents the performance profiles of the IT correspondig to the HS, PRP and LS methods. In this figure, it is observable that PRP method is best in
Figure 5 presents the performance profiles of the FE of the HS, PRP and LS methods. It is observable that PRP method is best in
Figure 6 presents the performance profiles of the CPU of the HS, PRP and LS methods. It is obvious that PRP is best in
From the previous analysis of the results shown in Table 5 and Figures 4-6, we can conclude that the PRP method achieved the best and most efficient results in terms of all three basic metrics: IT, FE and CPU.
In this subsection, we compare the numerical results obtained from DY, FR and CD methods, i.e, methods with
IT profile | FE profile | CPU time | |||||||
Test function | DY | FR | CD | DY | FR | CD | DY | FR | CD |
Perturbed Quadratic | 1157 | 1157 | 1157 | 3481 | 3481 | 3481 | 0.469 | 0.609 | 0.531 |
Raydan 2 | 86 | 40 | 40 | 192 | 100 | 100 | 0.063 | 0.016 | 0.016 |
Diagonal 2 | 1636 | 3440 | 2058 | 4774 | 7982 | 8063 | 0.922 | 1.563 | 1.297 |
Extended Tridiagonal 1 | 2081 | 690 | 1140 | 4639 | 2022 | 2984 | 1.703 | 1.141 | 1.578 |
Diagonal 4 | 70 | 70 | 70 | 200 | 200 | 200 | 0.047 | 0.031 | 0.016 |
Diagonal 5 | 40 | 124 | 155 | 100 | 258 | 320 | 0.109 | 0.141 | 0.125 |
Extended Himmelblau | 383 | 339 | 207 | 1669 | 1467 | 961 | 0.219 | 0.172 | 0.172 |
Full Hessian FH2 | 4682 | 4868 | 4794 | 14054 | 14610 | 14390 | 65.938 | 66.469 | 65.922 |
Perturbed quadratic diagonal | 1036 | 1084 | 1276 | 3114 | 3258 | 3834 | 0.406 | 0.422 | 0.422 |
Quadratic QF1 | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.297 | 0.297 | 0.328 |
Quadratic QF2 | NaN | NaN | 2349 | NaN | NaN | 10073 | NaN | NaN | 1.531 |
Extended quadratic exponential EP1 | NaN | 60 | 60 | NaN | 310 | 310 | NaN | 0.109 | 0.125 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.422 | 0.453 | 0.391 |
LIARWHD (CUTE) | 2812 | 1202 | 1255 | 12366 | 7834 | 7379 | 0.938 | 1.000 | 1.109 |
POWER (CUTE) | 7779 | 7781 | 7782 | 23347 | 23353 | 23356 | 1.078 | 1.500 | 1.328 |
NONSCOMP (CUTE) | 2558 | 13483 | 10901 | 49960 | 43268 | 33413 | 1.203 | 1.406 | 1.422 |
QUARTC (CUTE) | 134 | 94 | 95 | 1132 | 901 | 916 | 0.688 | 0.672 | 0.563 |
Diagonal 6 | 86 | 40 | 40 | 192 | 100 | 100 | 0.047 | 0.063 | 0.063 |
DIXON3DQ (CUTE) | 16047 | 18776 | 19376 | 48172 | 56369 | 58176 | 2.266 | 2.516 | 2.734 |
BIGGSB1 (CUTE) | 15274 | 17835 | 18374 | 45853 | 53546 | 55170 | 2.875 | 2.922 | 2.484 |
Generalized Quartic | 142 | 214 | 173 | 497 | 712 | 589 | 0.078 | 0.172 | 0.109 |
Diagonal 7 | 50 | 50 | 50 | 160 | 160 | 160 | 0.063 | 0.047 | 0.094 |
Diagonal 8 | 50 | 40 | 40 | 160 | 130 | 130 | 0.109 | 0.125 | 0.063 |
Full Hessian FH3 | 43 | 43 | 43 | 139 | 139 | 139 | 0.063 | 0.109 | 0.109 |
FLETCHCR (CUTE) | NaN | NaN | 26793 | NaN | NaN | 240237 | NaN | NaN | 10.203 |
Table 6 contains numerical results (IT, FE and CPU) for the DY, FR and CD methods. Figures 7, 8 and 9 plot the performance profiles for the results in Table 6 with respect to profiles IT, FE and CPU, respectively.
From Figure 7, it is observable that the graph of the CD method comes first to the top, which signifies that the CD outperforms other considered methods with respect to the IT.
Figure 8 presents the performance profiles of the FE of the DY, FR and CD methods. From Figure 8, it is observed that the CD graph first comes to the highest level, which means that the CD possesses best performances with respect to the criterion FE.
Figure 9 presents the performance profiles of the CPU of the DY, FR and CD methods. Figure 9 demonstrates that the graph of the CD method first achieves the top level, so that the CD is winer with respect to the CPU.
From the previous analysis of the results shown in Table 6 and Figures 7-9, it is clear that the CD method achieved most efficient results in terms of all three basic metrics: IT, FE and CPU.
This subsection analyses numerical results obtained by running a MATLAB implementation with predefined conditions given at the beginning section. The following ten hybrid CG methods in the form of (1.2) and (1.10), which differ only in the choice of the CG parameter
- HCG1: The CG method with
- HCG2: The CG method with
- HCG3: The CG method with
- HCG4: The CG method with
- HCG5: The CG method with
- HCG6: The CG method with
- HCG7: The CG method with the parameter
- HCG8: The CG method with the parameter
- HCG9: The CG method with the parameter
- HCG10: The CG method with the parameter
The numerical experiment contains
Summary numerical results for hybrid CG methods, tried on 25 test functions, with respect to IT, FE and CPU profiles are presented in Table 7.
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 |
Raydan 2 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
Diagonal 2 | 1584 | 1581 | 1542 | 1488 | 1500 | 2110 | 2193 | 1843 | 1475 | 1453 |
Extended Tridiagonal 1 | 805 | 623 | 754 | 2110 | 2160 | 10129 | 1167 | 966 | NaN | 270 |
Diagonal 4 | 60 | 60 | 70 | 60 | 70 | 70 | 60 | 70 | NaN | 113 |
Diagonal 5 | 124 | 39 | 98 | 39 | 120 | 109 | 39 | 141 | 154 | 130 |
Extended Himmelblau | 145 | 139 | 111 | 161 | 181 | 207 | 159 | 381 | 109 | 108 |
Full Hessian FH2 | 5036 | 5036 | 5036 | 4820 | 4820 | 4800 | 4994 | 4789 | 5163 | 5705 |
Perturbed quadratic diagonal | 1228 | 1214 | 1266 | 934 | 1093 | 987 | 996 | 1016 | NaN | 2679 |
Quadratic QF1 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | NaN | 1158 |
Quadratic QF2 | 2125 | 2098 | 2174 | 1995 | 1991 | 2425 | 2378 | NaN | 2204 | 2034 |
TRIDIA (CUTE) | NaN | NaN | NaN | 6210 | 6210 | 5594 | NaN | NaN | 6748 | 7345 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 |
LIARWHD (CUTE) | 1367 | 817 | 1592 | 1024 | 1831 | 1774 | 531 | 2152 | NaN | 573 |
POWER (CUTE) | 7782 | 7782 | 7782 | 7779 | 7779 | 7802 | 7781 | 7780 | NaN | 7781 |
NONSCOMP (CUTE) | 10092 | 10746 | 8896 | 10466 | 9972 | 13390 | 11029 | 3520 | 3988 | 11411 |
QUARTC (CUTE) | 94 | 160 | 145 | 150 | 126 | 95 | 160 | 114 | 165 | 154 |
Diagonal 6 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
DIXON3DQ (CUTE) | 12182 | 5160 | 11257 | 5160 | 11977 | 14302 | 5160 | 17080 | NaN | 12264 |
BIGGSB1 (CUTE) | 10664 | 5160 | 10479 | 5160 | 11082 | 13600 | 5160 | 16192 | NaN | 11151 |
Generalized Quartic | 129 | 107 | 110 | 107 | 142 | 153 | 107 | 123 | 131 | 145 |
Diagonal 7 | 50 | NaN | 40 | NaN | 40 | 50 | NaN | 50 | 51 | 40 |
Diagonal 8 | 40 | 40 | 40 | 50 | NaN | 50 | 40 | NaN | NaN | 40 |
Full Hessian FH3 | 43 | 42 | 42 | 42 | 42 | 43 | 42 | 43 | NaN | NaN |
FLETCHCR (CUTE) | 17821 | 17632 | 18568 | 17272 | 17446 | 26794 | 24865 | NaN | 17315 | 20813 |
Figure 10 plot corresponding performance profiles IT for the results included in Table 7, in three columns denoted by IT.
From Figure 10, it is observable that the graph of the HCG6 method comes first to the top, which signifies that the HCG6 outperforms other considered methods with respect to the IT. However, if we look in more detail Figure 10, we can see that the HCG6 method does not have the best results, it has the best results in only
The numerical results of the hybrid CG methods with respect to the FE are arranged in Table 8.
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 |
Raydan 2 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
Diagonal 2 | 6136 | 6217 | 6006 | 5923 | 5944 | 8281 | 8594 | 4822 | 5711 | 5636 |
Extended Tridiagonal 1 | 2369 | 1991 | 2275 | 4678 | 4924 | 22418 | 3119 | 2661 | NaN | 869 |
Diagonal 4 | 170 | 170 | 200 | 170 | 200 | 200 | 170 | 200 | NaN | 339 |
Diagonal 5 | 258 | 88 | 206 | 88 | 270 | 228 | 88 | 292 | 338 | 270 |
Extended Himmelblau | 855 | 687 | 583 | 763 | 813 | 961 | 757 | 1613 | 567 | 594 |
Full Hessian FH2 | 15115 | 15115 | 15115 | 14467 | 14467 | 14407 | 14989 | 14374 | 15495 | 17122 |
Perturbed quadratic diagonal | 3686 | 3647 | 3805 | 2805 | 3282 | 2967 | 2993 | 3053 | NaN | 8044 |
Quadratic QF1 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | NaN | 3484 |
Quadratic QF2 | 9455 | 9202 | 9501 | 9016 | 9054 | 10229 | 10086 | NaN | 9531 | 9085 |
TRIDIA (CUTE) | NaN | NaN | NaN | 18640 | 18640 | 16792 | NaN | NaN | 20260 | 22051 |
Almost Perturbed Quadratic | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 |
LIARWHD (CUTE) | 7712 | 5931 | 8275 | 6165 | 8113 | 9395 | 5854 | 10305 | NaN | 4848 |
POWER (CUTE) | 23356 | 23356 | 23356 | 23347 | 23347 | 23416 | 23353 | 23350 | NaN | 23353 |
NONSCOMP (CUTE) | 31355 | 33211 | 27801 | 32705 | 31458 | 40807 | 34013 | 23411 | 13367 | 35106 |
QUARTC (CUTE) | 901 | 1254 | 1261 | 1224 | 1224 | 916 | 1254 | 1041 | 1347 | 1305 |
Diagonal 6 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
DIXON3DQ (CUTE) | 36508 | 15534 | 33759 | 15534 | 35926 | 42952 | 15534 | 51284 | NaN | 36796 |
BIGGSB1 (CUTE) | 31960 | 15534 | 31427 | 15534 | 33247 | 40846 | 15534 | 48620 | NaN | 33469 |
Generalized Quartic | 457 | 371 | 370 | 371 | 481 | 529 | 371 | 439 | 446 | 467 |
Diagonal 7 | 160 | NaN | 130 | NaN | 130 | 160 | NaN | 160 | 142 | 13 |
Diagonal 8 | 130 | 130 | 130 | 160 | NaN | 160 | 130 | NaN | NaN | 130 |
Full Hessian FH3 | 139 | 136 | 136 | 136 | 136 | 139 | 136 | 139 | NaN | NaN |
FLETCHCR (CUTE) | 166463 | 165774 | 168739 | 175309 | 175845 | 240240 | 184939 | NaN | 174406 | 215687 |
Figure 11 plots the performance profiles for the results in Table 8, in three columns denoted by FE.
From Figure 11, it is observable that the graph of the HCG6 method comes first to the top, which signifies that the HCG6 outperforms other considered methods with respect to the FE. However, if we look in more detail Figure 11, we can see an identical situation as in Figure 10 that the HCG6 method does not have the best results, it has the best results in only
Table 9 contains numerical results of the hybrid CG methods with respect to the CPU.
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 0.656 | 0.516 | 0.781 | 0.719 | 0.594 | 0.438 | 0.719 | 0.688 | 0.844 | 0.688 |
Raydan 2 | 0.031 | 0.063 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | NaN | 0.078 |
Diagonal 2 | 1.453 | 1.328 | 1.656 | 1.172 | 1.438 | 1.797 | 1.813 | 1.266 | 1.250 | 1.141 |
Extended Tridiagonal 1 | 1.016 | 1.125 | 1.359 | 2.250 | 2.375 | 7.578 | 1.672 | 1.375 | NaN | 0.922 |
Diagonal 4 | 0.031 | 0.031 | 0.031 | 0.078 | 0.078 | 0.047 | 0.109 | 0.094 | NaN | 0.094 |
Diagonal 5 | 0.141 | 0.063 | 0.156 | 0.094 | 0.094 | 0.125 | 0.109 | 0.078 | 0.219 | 0.156 |
Extended Himmelblau | 0.172 | 0.172 | 0.109 | 0.141 | 0.172 | 0.141 | 0.125 | 0.141 | 0.172 | 0.125 |
Full Hessian FH2 | 83.125 | 91.938 | 86.984 | 85.766 | 94.484 | 78.281 | 77.141 | 74.500 | 80.969 | 82.469 |
Perturbed quadratic diagonal | 0.406 | 0.609 | 0.641 | 0.375 | 0.563 | 0.359 | 0.328 | 0.344 | NaN | 0.734 |
Quadratic QF1 | 0.359 | 0.438 | 0.422 | 0.422 | 0.406 | 0.391 | 0.484 | 0.422 | NaN | 0.281 |
Quadratic QF2 | 1.047 | 1.313 | 1.203 | 1.156 | 1.063 | 1.156 | 1.000 | NaN | 1.094 | 1.047 |
TRIDIA (CUTE) | NaN | NaN | NaN | 1.688 | 1.391 | 1.859 | NaN | NaN | 1.875 | 1.391 |
Almost Perturbed Quadratic | 0.406 | 0.438 | 0.516 | 0.594 | 0.250 | 0.359 | 0.406 | 0.578 | 0.641 | 0.422 |
LIARWHD (CUTE) | 0.938 | 0.828 | 1.203 | 0.797 | 1.125 | 1.172 | 0.938 | 1.203 | NaN | 0.594 |
POWER (CUTE) | 1.563 | 1.672 | 1.750 | 1.609 | 1.625 | 1.578 | 1.625 | 1.188 | NaN | 1.453 |
NONSCOMP (CUTE) | 1.547 | 1.484 | 1.063 | 1.766 | 1.422 | 1.719 | 1.516 | 1.063 | 1.203 | 1.703 |
QUARTC (CUTE) | 0.750 | 1.000 | 0.969 | 0.969 | 0.875 | 0.797 | 0.938 | 0.703 | 1.266 | 0.93 |
Diagonal 6 | 0.078 | 0.078 | 0.078 | 0.094 | 0.063 | 0.016 | 0.016 | 0.125 | NaN | 0.109 |
DIXON3DQ (CUTE) | 2.047 | 1.453 | 2.016 | 1.484 | 2.359 | 2.234 | 1.406 | 2.297 | NaN | 2.078 |
BIGGSB1 (CUTE) | 1.875 | 2.047 | 2.359 | 1.750 | 2.250 | 2.391 | 1.422 | 2.672 | NaN | 2.422 |
Generalized Quartic | 0.063 | 0.125 | 0.141 | 0.156 | 0.125 | 0.094 | 0.078 | 0.109 | 0.172 | 0.109 |
Diagonal 7 | 0.063 | NaN | 0.016 | NaN | 0.109 | 0.063 | NaN | 0.063 | 0.063 | 0.063 |
Diagonal 8 | 0.078 | 0.125 | 0.078 | 0.031 | NaN | 0.063 | 0.109 | NaN | NaN | 0.078 |
Full Hessian FH3 | 0.063 | 0.047 | 0.109 | 0.047 | 0.031 | 0.063 | 0.047 | 0.109 | NaN | NaN |
FLETCHCR (CUTE) | 5.656 | 6.750 | 7.922 | 9.484 | 6.484 | 8.766 | 7.281 | NaN | 6.906 | 7.547 |
Figure 12 plots the performance profiles for the results in Tables 9.
From Figure 12, it is observable that the graph of the HCG6 method comes first to the top, which signifies that the HCG6 outperforms other considered methods with respect to the CPU. However, if we look in more detail Figure 12, we can see an identical situation as in the figures 10 and 11 that the HCG6 method does not have the best results, it has the best results in only
The numerical experiments presented in this subsection investigate the influence of the scalar size in the modified Dai-Liao methods. The previously mentioned variants of the DL method use a fixed value of the parameter
That is why we come to the next question. What is the 'best' value of
As we have already indicated in the previous section, the aim of this subsection is to answer to the question: what is the 'best' value of
- Does and how much the values of the scalar
- Does the choice of
To give an answer to these questions, we would have to test all the methods under the same conditions. During testing, we will compare all the considered methods with the same values of required scalars. The Algorithm 2, i.e. the backtracking line search, determines the step-size
Algorithm 6 gives the corresponding general framework for DHSDL, DLSDL, MHSDL and MLSDL methods.
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Values of
Label | T1 | T2 | T3 | T4 | T5 | T6 |
Value of the scalar |
0.05 | 0.1 | 0.2 | 0.5 | 0.9 |
The reason why we decided to define values
-
-
The convergence of the above methods has already considered in the mentioned references. Our goal is to give a unified analysis of the convergence of proposed DL methods. The aim of our research is to investigate the influence of the scalar
Each particular value of the scalar
The convergence of the above methods has already considered in the mentioned references. Our goal is to give a unified analysis of the convergence of the proposed DL methods. The aim of our research is to investigate the influence of the scalar
Arranged numerical results are generated by testing and comparing DHSDL, DLSDL, MHSDL and MLSDL methods on
The uniform stopping criteria are (refer to previous)
\|\mathbf{g}_k\| \leq 10^{-6} \ \ \ \mathrm{and}\ \ \ \frac{|f_{k+1}-f_k|}{1+|f_k|} \leq 10^{-16}. |
The backtracking parameters for all algorithms are
During the testing of the DHSDL and DLSDL methods, the constant parameter
Figure 13 indicates the performance profiles of the IT criterion with respect to the DHSDL method depending on the scalar value
Figure 14 shows the performance profile given by FE of the DHSDL solver with respect to
Figure 15 illustrates the CPU criterion spanned by the DHSDL method depending on the metrics
The graphs displayed in figures 13–15 show that the DHSDL method has achieved superior results for
Figure 16 illustrates the IT criterion in the DLSDL method depending on the scalar value
Figure 17 shows the performance profiles of the criterion FE corresponding to the DLSDL method and the scalar value
Figure 18 shows the performance profiles of the CPU time of the DLSDL method depending on
Based on figures (16–18) analysis, we come to the conclusion that the DLSDL method has achieved best responses for
In Subsection 6.5.1, we have defined two questions. Our aim in this subsection is to give answers. In order to answer properly to the first question, in Tables 11, 12, 13 are collected average values for all three considered criteria.
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 32980.14 | 31281.32 | 33640.45 | 32942.36 | 34448.32 | 33872.36 |
DLSDL | 30694.00 | 28701.14 | 31048.32 | 30594.77 | 31926.59 | 31573.05 |
MHSDL | 29289.73 | 27653.64 | 29660.00 | 29713.50 | 30491.18 | 30197.27 |
MLSDL | 25398.82 | 22941.77 | 24758.27 | 24250.68 | 25722.64 | 25032.64 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 1228585.50 | 1191960.55 | 1252957.09 | 1238044.36 | 1271176.59 | 1255710.45 |
DLSDL | 1131421.41 | 1083535.14 | 1149482.41 | 1134315.00 | 1167030.14 | 1158554.77 |
MHSDL | 1089700.41 | 1036710.32 | 1089777.64 | 1091985.41 | 1105299.91 | 1101380.18 |
MLSDL | 904217.14 | 845017.55 | 891669.50 | 879473.14 | 913165.68 | 895652.36 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 902.06 | 894.73 | 917.77 | 930.56 | 911.28 | 870.93 |
DLSDL | 816.08 | 790.63 | 804.69 | 816.28 | 803.84 | 809.67 |
MHSDL | 770.78 | 751.65 | 728.61 | 749.70 | 712.64 | 720.57 |
MLSDL | 573.14 | 587.41 | 581.50 | 576.32 | 582.62 | 580.96 |
After the numerical testing of the compared methods and the individual analysis for each method, we can now give a global conclusion of the behavior of the observed methods. The first conclusion is that the value of the scalar
Overview of QN methods, CG methods and their classification are presented. Section 5 investigates convergence properties of CG methods, following the CG classes in accordance with the presented taxonomy of basic CG methods. Numerical experiments compare main classes of QN and CG methods. More precisely, main QN methods with constant diagonal Hessian approximation are compared as well as two classes of basic CG methods, hybrid CG methods, and finally some variants of modified Dai-Liao methods.
The problem of defining further improvements of the CGUP
One of prospective fields for further research includes a generalization of the discrete-time approach to continuous-time approach, considered in [69]. Another possibility for further research is extension of gradient methods to tensor case. This possibility was exploited in [77] on solving
Moreover, an application of low rank updates used in optimization can be transferred to appropriate numerical methods for computing generalized inverses. Applications of rank-one update formula are investigated in [115,116].
Predrag Stanimirović and Dijana Mosić are supported by the Ministry of Education, Science and Technological Development, Grants No. 174013 and 174007. Haifeng Ma is supported by the National Natural Science Foundation of China under grant 11971136 and the bilateral project between China and and Poland (no. 37-18).
[1] |
Najihah TS, Ibrahim M H, Zain NAM, et al. (2020) Activity of the oil palm seedlings exposed to a different rate of potassium fertilizer under water stress condition. AIMS Environ Sci 7: 46-68. doi: 10.3934/environsci.2020004
![]() |
[2] |
Sönmez FK, Kömüscü AÜ, Erkan A, et al. (2005) An analysis of spatial and temporal dimension of drought vulnerability in Turkey using the standardized precipitation index. Nat Hazards 35: 243-264. doi: 10.1007/s11069-004-5704-7
![]() |
[3] |
Manjowe M, Mushore TD, Gwenzi J, et al. (2018) Circulation mechanisms responsible for wet or dry summers over Zimbabwe. AIMS Environ Sci 5: 154-172. doi: 10.3934/environsci.2018.3.154
![]() |
[4] |
Smakhtin VU, Hughes DA, (2007) Automated estimation and analyses of meteorological drought characteristics from monthly rainfall data. Environ Modell Softw 22: 880-890. doi: 10.1016/j.envsoft.2006.05.013
![]() |
[5] |
Christy JR (2019) Examination of extreme rainfall events in two regions of the United States since the 19th century. AIMS Environ Sci 6: 109-126. doi: 10.3934/environsci.2019.2.109
![]() |
[6] | Yahaya I, Adamu SJ, Muhammed BB, (2017) The use of Standardized Precipitation Index (SPI) for Drought Intensity Assessment in North-Eastern Nigeria. Researchjournal's J Geo 4:1-13 |
[7] |
Feyissa G, Zeleke G, Bewket W, et al. (2018) Downscaling of future temperature and precipitation extremes in Addis Ababa under climate change. Climate 6: 58. doi: 10.3390/cli6030058
![]() |
[8] | Engida M (1999) Annual rainfall and evapotranspiration in Ethiopia. Ethiopian Journal of Natural Resources 1: 137-154. |
[9] | IPCC (2007) Climate Change 2007: Impacts, Adaptation and Vulnerability. Contribution of Working Group Ⅱ to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Edited by Parry, M., Canziani, O., Palutikof, J., Linden, P.vd., Hanson, C., Cambridge University Press 32 Avenue of the Americas, New York, USA, 10013-2473. |
[10] | Alemu ZA, Dioha MO, (2020a) Climate change and trend analysis of temperature: the case of Addis Ababa, Ethiopia. Environ Syst Res 9: 1-15. |
[11] | FDRE, Ethiopian Government Portal, 2018. Available from: http://www.ethiopia.gov.et/addis-ababa-city-administration. |
[12] | Niemeyer S, New drought indices. European Commission, DG Joint Research Centre, Institute for Environment and Sustainability, T.P. 261, JRC-IES, I-21020 Ispra (VA), Italy, 2020 Availablefrom: https://om.ciheam.org/om/pdf/a80/00800451.pdf |
[13] | Gebreyesus M, Cherint A, Ashine T, et al. Drought Analysis Using Reconnaissance Drought Index (RDI): In the case of Awash River Basin, Ethiopia2020. Available from: https://www.researchgate.net/publication/344440865_Drought_analysis_using_reconnaissance_drought_index_RDI_In_the_case_of_Awash_River_Basin_Ethiopia |
[14] | Tigkas D, Vangelis H, Tsakiris G (2013) The RDI as a composite climatic index. E.W. Publications. European Water 41: 17-22. |
[15] | McKee TB, Doesken NJ, Kleist J (1993) The relationship of drought frequency and duration of time scales. Eighth Conference on Applied Climatology, American Meteorological Society, Anaheim CA. |
[16] | Aksoy H, Onoz B, Cetin M, et al. (2018) SPI-based Drought Severity-Duration-Frequency Analysis. 13th International Congress on Advances in Civil Engineering, Izmir/Turkey. |
[17] | Pashiardis S, Michaelides S (2008) Implementation of the Standardized Precipitation Index (SPI) and the Reconnaissance Drought Index (RDI) for Regional Drought Assessment: A case study for Cyprus. E.W. Publications. Eur Water 23/24: 57-65. |
[18] |
Asadi-Zarch MA, Malekinezhad H, Mobin MH, et al. (2011) Drought Monitoring by Reconnaissance Drought Index (RDI) in Iran. Water Resour Manag 25: 3485-3504. doi: 10.1007/s11269-011-9867-1
![]() |
[19] | Ansarifard S, Shamsnia SA (2018) Monitoring drought by Reconnaissance Drought Index (RDI) and Standardized Precipitation Index (SPI) using DrinC software. Water Utility J 20: 29-35. |
[20] |
Hayes MJ, Svoboda MD, Wilhite DA, et al. (1999) Monitoring the 1996 Drought Using the Standardized Precipitation Index. B Am Meteorol Soc 80: 429-438. doi: 10.1175/1520-0477(1999)080<0429:MTDUTS>2.0.CO;2
![]() |
[21] |
Zarei AR, Moghimi MM, Mahmoudi MR (2016) Analysis of Changes in Spatial Pattern of Drought Using RDI Index in south of Iran. Water Resour Manag 30: 3723-3743. doi: 10.1007/s11269-016-1380-0
![]() |
[22] | Pramudya Y, Onishi T, (2018) Assessment of the Standardized Precipitation Index (SPI) in Tegal City, Central Java, Indonesia. IOP Conf. Series: Earth Environ Sci 129: 012-019. |
[23] | Maroua BA, Nouiri I, (2018) Study of trends in historical variation and mapping of drought events in Tunisia: A global assessment of Standardized precipitation index (SPI) and Reconnaissance drought index (RDI) for the period 1973-2016. 3rd International Conference on Integrated Environmental Management for Sustainable Development. ISSN 1737-3638. |
[24] |
Vangelis H, Tigkas D, Tsakiris G, (2012) The effect of PET method on Reconnaissance Drought Index (RDI) calculation. J Arid Environ 88: 130-140. doi: 10.1016/j.jaridenv.2012.07.020
![]() |
[25] |
Tsakiris G, Vangelis H, Pangalou D, (2007) Regional Drought Assessment Based on the Reconnaissance Drought Index (RDI). Water Resour Manage 21: 821-833. doi: 10.1007/s11269-006-9105-4
![]() |
[26] | Fitsume Y, (2014) Precipitation Extremes and their Pattern in the Central Highlands of Ethiopia: SPI Based Analysis. J Nat Sci Res 4: 92-97. |
[27] |
Gidey E, Dikinya O, Sebego R, et al. (2018) Modeling the Spatio-Temporal Meteorological Drought Characteristics Using the Standardized Precipitation Index (SPI) in Raya and Its Environs, Northern Ethiopia. Earth Sys Environ 2: 281-292. doi: 10.1007/s41748-018-0057-7
![]() |
[28] |
Spinoni J, Naumann G, Carrao H, et al. (2013) World drought frequency, duration, and severity for 1951-2010. Int J Climatol 34: 2792-2804. doi: 10.1002/joc.3875
![]() |
[29] |
Tigkas D, Vangelis H, Tsakiris G (2015) DrinC: a software for drought analysis based on drought indices. Earth Sci Inform 8: 697-709. doi: 10.1007/s12145-014-0178-y
![]() |
[30] | Climate-Ethiopia, Climates to travel world climate guide, 2021. Available from: https://www.climatestotravel.com/climate/ethiopia. |
[31] | Alemu ZA, Dioha MO (2020b) Modelling scenarios for sustainable water supply and demand in Addis Ababa city, Ethiopia. Environ Syst Res 9: 1-14. |
[32] | FDRE, City Map of Addis Ababa City Administration, Ethiopia, 2020. Available from: http://www.addisababa.gov.et/de/web/guest/city-map |
[33] | Rossi G, Bonaccorso B, Vega T (2007) Methods and tools for drought analysis and management, Springer Science and Business Media, Berlin. vol 62. ISBN 978-1-4020-5923-0 |
[34] | Tsakiris G, Nalbantis I, Pangalou D, et al. (2008) Drought meteorological monitoring network design for the reconnaissance drought index (RDI). In: Franco Lopez A. (Ed.), Proceedings of the 1st International Conference "Drought Management: scientific and technological innovations". Zaragoza, Spain: Option Méditerranéennes, Series A, 80: 57-62. |
[35] |
Vangelis H, Tigkas D, Tsakiris G (2013) The effect of PET method on Reconnaissance Drought Index (RDI) calculation. J Arid Environ 88: 130-140. doi: 10.1016/j.jaridenv.2012.07.020
![]() |
[36] | Allen RG, Pereira LS, Raes D, et al. (1998) Crop evapotranspiration: guidelines for computing crop water requirements. FAO irrigation and drainage paper 56, 1st edition. Rome, Italy. |
[37] |
Hargreaves GH, Samani ZA (1985) Reference crop evapotranspiration from temperature. Appl Eng Agric 1: 96-99. doi: 10.13031/2013.26773
![]() |
[38] | Tigkas D (2008) Drought Characterization and Monitoring in Regions of Greece. Eur Water 23/24: 29-39. |
[39] |
Khanmohammadi N, Rezaie H, Montaseri M, et al. (2017) The Effect of Temperature Adjustment on Reference Evapotranspiration and Reconnaissance Drought Index (RDI) in Iran. Water Resour Manag 31: 5001-5017. doi: 10.1007/s11269-017-1793-4
![]() |
[40] | Gupta SP (2007) Statistical Methods. Seventh Revised and Enlarged Edition ed. Sultan Chand and Sons, Educational Publisher. New Delhi. |
[41] | Helsel DR, Hirsch RM (2002) Statistical methods in water resources. Techniques of water-resources investigations of the United States geological survey, book 4, hydrologic analysis and interpretation. U. S. Geological survey |
[42] | Philip S, Kew SF, Oldenborgh GJ, et al. (2018) Attribution Analysis of the Ethiopian Drought of 2015. American Meteorological Society 2465-2486. |
[43] | Jamshidi H, Khalili D, Zadeh MR, et al. (2011) Assessment and comparison of SPI and RDI meteorological drought indices in selected synoptic stations of Iran. In World Environmental and Water Resources Congress 2011: Bearing Knowledge for Sustainability, 1161-1173. |
[44] |
Haied N, Foufou A, Chaab S, et al. (2017) Drought assessment and monitoring using meteorological indices in a semi-arid region. Energy Procedia 119: 518-529. doi: 10.1016/j.egypro.2017.07.064
![]() |
[45] |
Shah R, Bharadiya N, Manekar V (2015) Drought index computation using Standardized Precipitation Index (SPI) method for Surat district, Gujarat. J Aquat Proced 4: 1243-1249. doi: 10.1016/j.aqpro.2015.02.162
![]() |
[46] |
Thilakarathne M, Sridhar V (2017) Characterization of future drought conditions in the Lower Mekong River Basin. Weather Climate Extremes 17: 47-58. doi: 10.1016/j.wace.2017.07.004
![]() |
[47] |
Khatiwada KR, Pandey VP (2019) Characterization of hydro-meteorological drought in Nepal Himalaya: A case of Karnali River Basin. Weather Climate Extremes 26: 100239. doi: 10.1016/j.wace.2019.100239
![]() |
[48] |
Abdelmalek MB, Nouiri I (2020) Study of trends and mapping of drought events in Tunisia and their impacts on agricultural production. Sci Total Environ 734: 139311. doi: 10.1016/j.scitotenv.2020.139311
![]() |
[49] | Almedeij J, (2014) Drought analysis for kuwait using standardized precipitation index. The Sc World J. 2014 |
[50] | NMSA, (1996) Climatic and agro climatic resources of Ethiopia. National Meteorology Service Agency of Ethiopia, Addis Ababa. 1: 137. |
[51] |
Temam D, Uddameri V, Mohammadi G, et al. (2019) Long-Term Drought Trends in Ethiopia with Implications for Dryland Agriculture. MDPI-Water 11: 2571. doi: 10.3390/w11122571
![]() |
[52] |
Cermak V, Bodri L, Safanda J, et al. (2019) Variability trends in the daily air temperatures series Running head: Variability trends prague. AIMS Environ Sci 6: 167-185. doi: 10.3934/environsci.2019.3.167
![]() |
1. | Yiyuan Qian, Kai Zhang, Jingzhi Li, Xiaoshen Wang, Adaptive neural network surrogate model for solving the implied volatility of time-dependent American option via Bayesian inference, 2022, 30, 2688-1594, 2335, 10.3934/era.2022119 | |
2. | Branislav Ivanov, Gradimir V. Milovanović, Predrag S. Stanimirović, Accelerated Dai-Liao projection method for solving systems of monotone nonlinear equations with application to image deblurring, 2023, 85, 0925-5001, 377, 10.1007/s10898-022-01213-4 | |
3. | Krzysztof Gdawiec, Wiesław Kotarski, Agnieszka Lisowska, On the robust Newton’s method with the Mann iteration and the artistic patterns from its dynamics, 2021, 104, 0924-090X, 297, 10.1007/s11071-021-06306-5 | |
4. | Mohd Rizman Sultan Mohd, Juliana Johari, Fazlina Ahmat Ruslan, Noorfadzli Abdul Razak, Salmiah Ahmad, Ahmad Syahiman Mohd Shah, 2022, Chapter 47, 978-981-16-8514-9, 619, 10.1007/978-981-16-8515-6_47 | |
5. | Jingxin Lin, Nianfeng Wang, Hao Zhu, Xianmin Zhang, Xuewei Zheng, 2021, Fabric defect detection based on multi-input neural network, 978-1-6654-3153-8, 458, 10.1109/M2VIP49856.2021.9665111 | |
6. | Yapeng Li, Xiangzhen Wang, Wenjie Cheng, Songyang Gao, Chuntian Cheng, A combination approach for downstream plants to solve scheduling information asymmetry problem in electricity markets, 2023, 149, 01420615, 108935, 10.1016/j.ijepes.2022.108935 | |
7. | Mei Liu, Xiaoyan Zhang, Mingsheng Shang, Computational Neural Dynamics Model for Time-Variant Constrained Nonlinear Optimization Applied to Winner-Take-All Operation, 2022, 18, 1551-3203, 5936, 10.1109/TII.2021.3138794 | |
8. | Huiping Cao, Xiaomin An, Jing Han, Solving nonlinear equations with a direct Broyden method and its acceleration, 2022, 1598-5865, 10.1007/s12190-022-01818-8 | |
9. | Tina Mai, Daniele Mortari, Theory of functional connections applied to quadratic and nonlinear programming under equality constraints, 2022, 406, 03770427, 113912, 10.1016/j.cam.2021.113912 | |
10. | Saman Babaie-Kafaki, A survey on the Dai–Liao family of nonlinear conjugate gradient methods, 2023, 57, 0399-0559, 43, 10.1051/ro/2022213 | |
11. | Branislav Ivanov, Gradimir V. Milovanović, Predrag S. Stanimirović, Aliyu Muhammed Awwal, Lev A. Kazakovtsev, Vladimir N. Krutikov, Xian-Ming Gu, A Modified Dai–Liao Conjugate Gradient Method Based on a Scalar Matrix Approximation of Hessian and Its Application, 2023, 2023, 2314-4785, 1, 10.1155/2023/9945581 | |
12. | Sanaz Bojari, Mahmoud Paripour, Xian-Ming Gu, Solving Large-Scale Unconstrained Optimization Problems with an Efficient Conjugate Gradient Class, 2024, 2024, 2314-4785, 1, 10.1155/2024/5548724 | |
13. | Diego Armando Perez-Rosero, Andrés Marino Álvarez-Meza, Cesar German Castellanos-Dominguez, A Regularized Physics-Informed Neural Network to Support Data-Driven Nonlinear Constrained Optimization, 2024, 13, 2073-431X, 176, 10.3390/computers13070176 | |
14. | Naleli Jubert Matjelo, Sekhants'o Paul Lara, Moruti Kao, Limakatso Lepekola, Mampesi Thato Matobako, Madan Singh, 2024, Double Decay Parameter Estimation with Autoregressive and Differential Linear Regression, 979-8-3503-9591-4, 1, 10.1109/ICECET61485.2024.10698202 | |
15. | O. B. Onuoha, R. O. Ayinla, G. Egenti, O. E. Taiwo, Exploring Enhanced Conjugate Gradient Methods: A Novel Family of Techniques for Efficient Unconstrained Minimization, 2024, 2581-8147, 773, 10.34198/ejms.14424.773791 | |
16. | Long Jin, Lin Wei, Xin Lv, 2025, Chapter 4, 978-3-031-68593-4, 99, 10.1007/978-3-031-68594-1_4 | |
17. | Nur Idalisa, Norhaslinda Zullpakkal, Mohd Rivaie, Nurul ’Aini, Nurul Hajar, Wan Khadijah, Nurul Hafawati Fadhilah, 2024, 3080, 0094-243X, 030012, 10.1063/5.0192308 | |
18. | Elena Tovbis, Vladimir Krutikov, Predrag Stanimirović, Vladimir Meshechkin, Aleksey Popov, Lev Kazakovtsev, A Family of Multi-Step Subgradient Minimization Methods, 2023, 11, 2227-7390, 2264, 10.3390/math11102264 | |
19. | João Miguel Ferreira, Optimal control policies for a non-eruptive population of rodents—The relevance of migration, 2023, 484, 03043800, 110477, 10.1016/j.ecolmodel.2023.110477 | |
20. | Peng Yu, Shuping Tan, Jin Guo, Yong Song, Data-driven optimal controller design for sub-satellite deployment of tethered satellite system, 2024, 32, 2688-1594, 505, 10.3934/era.2024025 | |
21. | Jamilu Sabi’u, Sekson Sirisubtawee, An inertial Dai-Liao conjugate method for convex constrained monotone equations that avoids the direction of maximum magnification, 2024, 70, 1598-5865, 4319, 10.1007/s12190-024-02123-2 | |
22. | Chunming Tang, Wancheng Tan, Yongshen Zhang, Zhixian Liu, An accelerated spectral CG based algorithm for optimization techniques on Riemannian manifolds and its comparative evaluation, 2025, 462, 03770427, 116482, 10.1016/j.cam.2024.116482 |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Quasi-Newton Eqs. | Ref. | |
[104] | ||
[72] | ||
[123] | ||
[133] | ||
[134] | ||
[59] |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Title | Year | Reference | |
1952 | [60] | ||
1964 | [48] | ||
1967 | [38] | ||
1969 | [102,103] | ||
1987 | [47] | ||
1991 | [79] | ||
1999 | [30] |
Denominator | |||
Numerator | |||
FR | DY | CD | |
PRP | HS | LS |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
IT profile | FE profile | CPU time | |||||||
Test function | AGD | MSM | SM | AGD | MSM | SM | AGD | MSM | SM |
Perturbed Quadratic | 353897 | 34828 | 59908 | 13916515 | 200106 | 337910 | 6756.047 | 116.281 | 185.641 |
Raydan 1 | 22620 | 26046 | 14918 | 431804 | 311260 | 81412 | 158.359 | 31.906 | 36.078 |
Diagonal 3 | 120416 | 7030 | 12827 | 4264718 | 38158 | 69906 | 5527.844 | 52.609 | 102.875 |
Generalized Tridiagonal 1 | 670 | 346 | 325 | 9334 | 1191 | 1094 | 11.344 | 1.469 | 1.203 |
Extended Tridiagonal 1 | 3564 | 1370 | 4206 | 14292 | 10989 | 35621 | 55.891 | 29.047 | 90.281 |
Extended TET | 443 | 156 | 156 | 3794 | 528 | 528 | 3.219 | 0.516 | 0.594 |
Diagonal 4 | 120 | 96 | 96 | 1332 | 636 | 636 | 0.781 | 0.203 | 0.141 |
Extended Himmelblau | 396 | 260 | 196 | 6897 | 976 | 668 | 1.953 | 0.297 | 0.188 |
Perturbed quadratic diagonal | 2542050 | 37454 | 44903 | 94921578 | 341299 | 460028 | 44978.750 | 139.625 | 185.266 |
Quadratic QF1 | 366183 | 36169 | 62927 | 13310016 | 208286 | 352975 | 12602.563 | 81.531 | 138.172 |
Extended quadratic penalty QP1 | 210 | 369 | 271 | 2613 | 2196 | 2326 | 1.266 | 1.000 | 0.797 |
Extended quadratic penalty QP2 | 395887 | 1674 | 3489 | 9852040 | 11491 | 25905 | 3558.734 | 3.516 | 6.547 |
Quadratic QF2 | 100286 | 32727 | 64076 | 3989239 | 183142 | 353935 | 1582.766 | 73.438 | 132.703 |
Extended quadratic exponential EP1 | 48 | 100 | 73 | 990 | 894 | 661 | 0.750 | 0.688 | 0.438 |
Extended Tridiagonal 2 | 1657 | 659 | 543 | 8166 | 2866 | 2728 | 3.719 | 1.047 | 1.031 |
ARWHEAD (CUTE) | 5667 | 430 | 270 | 214284 | 5322 | 3919 | 95.641 | 1.969 | 1.359 |
Almost Perturbed Quadratic | 356094 | 33652 | 60789 | 14003318 | 194876 | 338797 | 13337.125 | 73.047 | 133.516 |
LIARWHD (CUTE) | 1054019 | 3029 | 18691 | 47476667 | 27974 | 180457 | 27221.516 | 9.250 | 82.016 |
ENGVAL1 (CUTE) | 743 | 461 | 375 | 6882 | 2285 | 2702 | 3.906 | 1.047 | 1.188 |
QUARTC (CUTE) | 171 | 217 | 290 | 402 | 494 | 640 | 2.469 | 1.844 | 2.313 |
Generalized Quartic | 187 | 181 | 189 | 849 | 493 | 507 | 0.797 | 0.281 | 0.188 |
Diagonal 7 | 72 | 147 | 108 | 333 | 504 | 335 | 0.625 | 0.547 | 0.375 |
Diagonal 8 | 60 | 120 | 118 | 304 | 383 | 711 | 0.438 | 0.469 | 0.797 |
Full Hessian FH3 | 45 | 63 | 63 | 1352 | 566 | 631 | 1.438 | 0.391 | 0.391 |
Diagonal 9 | 329768 | 10540 | 13619 | 13144711 | 68189 | 89287 | 6353.172 | 43.609 | 38.672 |
IT profile | FE profile | CPU time | |||||||
Test function | HS | PRP | LS | HS | PRP | LS | HS | PRP | LS |
Perturbed Quadratic | 1157 | 1157 | 6662 | 3481 | 3481 | 19996 | 0.234 | 0.719 | 1.438 |
Raydan 2 | NaN | 174 | 40 | NaN | 373 | 120 | NaN | 0.094 | 0.078 |
Diagonal 2 | NaN | 1721 | 5007 | NaN | 6594 | 15498 | NaN | 1.313 | 2.891 |
Extended Tridiagonal 1 | NaN | 170 | 17079 | NaN | 560 | 54812 | NaN | 0.422 | 13.641 |
Diagonal 4 | NaN | 70 | 1927 | NaN | 180 | 5739 | NaN | 0.078 | 0.391 |
Diagonal 5 | NaN | 154 | 30 | NaN | 338 | 90 | NaN | 0.172 | 0.078 |
Extended Himmelblau | 160 | 120 | 241 | 820 | 600 | 1043 | 0.172 | 0.125 | 0.172 |
Full Hessian FH2 | 5096 | 5686 | 348414 | 15294 | 17065 | 1045123 | 83.891 | 80.625 | 5081.875 |
Perturbed quadratic diagonal | 1472 | 1120 | 21667 | 4419 | 3363 | 65057 | 0.438 | 0.391 | 2.547 |
Quadratic QF1 | 1158 | 1158 | 5612 | 3484 | 3484 | 16813 | 0.281 | 0.313 | 1.047 |
Extended quadratic penalty QP2 | NaN | 533 | NaN | NaN | 5395 | NaN | NaN | 0.781 | NaN |
Quadratic QF2 | 2056 | 2311 | NaN | 9168 | 9862 | NaN | 0.969 | 0.859 | NaN |
Extended quadratic exponential EP1 | NaN | NaN | 70 | NaN | NaN | 350 | NaN | NaN | 0.141 |
TRIDIA (CUTE) | 6835 | 6744 | NaN | 20521 | 20248 | NaN | 1.438 | 1.094 | NaN |
Almost Perturbed Quadratic | 1158 | 1158 | 5996 | 3484 | 3484 | 17998 | 0.281 | 0.328 | 1.063 |
LIARWHD (CUTE) | NaN | 408 | 11498 | NaN | 4571 | 50814 | NaN | 0.438 | 2.969 |
POWER (CUTE) | 7781 | 7789 | 190882 | 23353 | 23377 | 572656 | 1.422 | 1.219 | 14.609 |
NONSCOMP (CUTE) | 4545 | 3647 | NaN | 15128 | 12433 | NaN | 0.875 | 0.656 | NaN |
QUARTC (CUTE) | NaN | 165 | 155 | NaN | 1347 | 1466 | NaN | 0.781 | 0.766 |
Diagonal 6 | NaN | 174 | 137 | NaN | 373 | 442 | NaN | 0.109 | 0.125 |
DIXON3DQ (CUTE) | NaN | 12595 | 12039 | NaN | 37714 | 36091 | NaN | 1.641 | 2.859 |
BIGGSB1 (CUTE) | NaN | 11454 | 11517 | NaN | 34293 | 34530 | NaN | 1.969 | 2.141 |
Generalized Quartic | NaN | 134 | 139 | NaN | 458 | 445 | NaN | 0.125 | 0.094 |
Diagonal 7 | NaN | 51 | 80 | NaN | 142 | 240 | NaN | 0.063 | 0.109 |
Diagonal 8 | NaN | 70 | 80 | NaN | 180 | 180 | NaN | 0.063 | 0.125 |
FLETCHCR (CUTE) | 18292 | 19084 | 20354 | 178305 | 170266 | 171992 | 8.859 | 6.203 | 7.484 |
IT profile | FE profile | CPU time | |||||||
Test function | DY | FR | CD | DY | FR | CD | DY | FR | CD |
Perturbed Quadratic | 1157 | 1157 | 1157 | 3481 | 3481 | 3481 | 0.469 | 0.609 | 0.531 |
Raydan 2 | 86 | 40 | 40 | 192 | 100 | 100 | 0.063 | 0.016 | 0.016 |
Diagonal 2 | 1636 | 3440 | 2058 | 4774 | 7982 | 8063 | 0.922 | 1.563 | 1.297 |
Extended Tridiagonal 1 | 2081 | 690 | 1140 | 4639 | 2022 | 2984 | 1.703 | 1.141 | 1.578 |
Diagonal 4 | 70 | 70 | 70 | 200 | 200 | 200 | 0.047 | 0.031 | 0.016 |
Diagonal 5 | 40 | 124 | 155 | 100 | 258 | 320 | 0.109 | 0.141 | 0.125 |
Extended Himmelblau | 383 | 339 | 207 | 1669 | 1467 | 961 | 0.219 | 0.172 | 0.172 |
Full Hessian FH2 | 4682 | 4868 | 4794 | 14054 | 14610 | 14390 | 65.938 | 66.469 | 65.922 |
Perturbed quadratic diagonal | 1036 | 1084 | 1276 | 3114 | 3258 | 3834 | 0.406 | 0.422 | 0.422 |
Quadratic QF1 | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.297 | 0.297 | 0.328 |
Quadratic QF2 | NaN | NaN | 2349 | NaN | NaN | 10073 | NaN | NaN | 1.531 |
Extended quadratic exponential EP1 | NaN | 60 | 60 | NaN | 310 | 310 | NaN | 0.109 | 0.125 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.422 | 0.453 | 0.391 |
LIARWHD (CUTE) | 2812 | 1202 | 1255 | 12366 | 7834 | 7379 | 0.938 | 1.000 | 1.109 |
POWER (CUTE) | 7779 | 7781 | 7782 | 23347 | 23353 | 23356 | 1.078 | 1.500 | 1.328 |
NONSCOMP (CUTE) | 2558 | 13483 | 10901 | 49960 | 43268 | 33413 | 1.203 | 1.406 | 1.422 |
QUARTC (CUTE) | 134 | 94 | 95 | 1132 | 901 | 916 | 0.688 | 0.672 | 0.563 |
Diagonal 6 | 86 | 40 | 40 | 192 | 100 | 100 | 0.047 | 0.063 | 0.063 |
DIXON3DQ (CUTE) | 16047 | 18776 | 19376 | 48172 | 56369 | 58176 | 2.266 | 2.516 | 2.734 |
BIGGSB1 (CUTE) | 15274 | 17835 | 18374 | 45853 | 53546 | 55170 | 2.875 | 2.922 | 2.484 |
Generalized Quartic | 142 | 214 | 173 | 497 | 712 | 589 | 0.078 | 0.172 | 0.109 |
Diagonal 7 | 50 | 50 | 50 | 160 | 160 | 160 | 0.063 | 0.047 | 0.094 |
Diagonal 8 | 50 | 40 | 40 | 160 | 130 | 130 | 0.109 | 0.125 | 0.063 |
Full Hessian FH3 | 43 | 43 | 43 | 139 | 139 | 139 | 0.063 | 0.109 | 0.109 |
FLETCHCR (CUTE) | NaN | NaN | 26793 | NaN | NaN | 240237 | NaN | NaN | 10.203 |
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 |
Raydan 2 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
Diagonal 2 | 1584 | 1581 | 1542 | 1488 | 1500 | 2110 | 2193 | 1843 | 1475 | 1453 |
Extended Tridiagonal 1 | 805 | 623 | 754 | 2110 | 2160 | 10129 | 1167 | 966 | NaN | 270 |
Diagonal 4 | 60 | 60 | 70 | 60 | 70 | 70 | 60 | 70 | NaN | 113 |
Diagonal 5 | 124 | 39 | 98 | 39 | 120 | 109 | 39 | 141 | 154 | 130 |
Extended Himmelblau | 145 | 139 | 111 | 161 | 181 | 207 | 159 | 381 | 109 | 108 |
Full Hessian FH2 | 5036 | 5036 | 5036 | 4820 | 4820 | 4800 | 4994 | 4789 | 5163 | 5705 |
Perturbed quadratic diagonal | 1228 | 1214 | 1266 | 934 | 1093 | 987 | 996 | 1016 | NaN | 2679 |
Quadratic QF1 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | NaN | 1158 |
Quadratic QF2 | 2125 | 2098 | 2174 | 1995 | 1991 | 2425 | 2378 | NaN | 2204 | 2034 |
TRIDIA (CUTE) | NaN | NaN | NaN | 6210 | 6210 | 5594 | NaN | NaN | 6748 | 7345 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 |
LIARWHD (CUTE) | 1367 | 817 | 1592 | 1024 | 1831 | 1774 | 531 | 2152 | NaN | 573 |
POWER (CUTE) | 7782 | 7782 | 7782 | 7779 | 7779 | 7802 | 7781 | 7780 | NaN | 7781 |
NONSCOMP (CUTE) | 10092 | 10746 | 8896 | 10466 | 9972 | 13390 | 11029 | 3520 | 3988 | 11411 |
QUARTC (CUTE) | 94 | 160 | 145 | 150 | 126 | 95 | 160 | 114 | 165 | 154 |
Diagonal 6 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
DIXON3DQ (CUTE) | 12182 | 5160 | 11257 | 5160 | 11977 | 14302 | 5160 | 17080 | NaN | 12264 |
BIGGSB1 (CUTE) | 10664 | 5160 | 10479 | 5160 | 11082 | 13600 | 5160 | 16192 | NaN | 11151 |
Generalized Quartic | 129 | 107 | 110 | 107 | 142 | 153 | 107 | 123 | 131 | 145 |
Diagonal 7 | 50 | NaN | 40 | NaN | 40 | 50 | NaN | 50 | 51 | 40 |
Diagonal 8 | 40 | 40 | 40 | 50 | NaN | 50 | 40 | NaN | NaN | 40 |
Full Hessian FH3 | 43 | 42 | 42 | 42 | 42 | 43 | 42 | 43 | NaN | NaN |
FLETCHCR (CUTE) | 17821 | 17632 | 18568 | 17272 | 17446 | 26794 | 24865 | NaN | 17315 | 20813 |
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 |
Raydan 2 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
Diagonal 2 | 6136 | 6217 | 6006 | 5923 | 5944 | 8281 | 8594 | 4822 | 5711 | 5636 |
Extended Tridiagonal 1 | 2369 | 1991 | 2275 | 4678 | 4924 | 22418 | 3119 | 2661 | NaN | 869 |
Diagonal 4 | 170 | 170 | 200 | 170 | 200 | 200 | 170 | 200 | NaN | 339 |
Diagonal 5 | 258 | 88 | 206 | 88 | 270 | 228 | 88 | 292 | 338 | 270 |
Extended Himmelblau | 855 | 687 | 583 | 763 | 813 | 961 | 757 | 1613 | 567 | 594 |
Full Hessian FH2 | 15115 | 15115 | 15115 | 14467 | 14467 | 14407 | 14989 | 14374 | 15495 | 17122 |
Perturbed quadratic diagonal | 3686 | 3647 | 3805 | 2805 | 3282 | 2967 | 2993 | 3053 | NaN | 8044 |
Quadratic QF1 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | NaN | 3484 |
Quadratic QF2 | 9455 | 9202 | 9501 | 9016 | 9054 | 10229 | 10086 | NaN | 9531 | 9085 |
TRIDIA (CUTE) | NaN | NaN | NaN | 18640 | 18640 | 16792 | NaN | NaN | 20260 | 22051 |
Almost Perturbed Quadratic | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 |
LIARWHD (CUTE) | 7712 | 5931 | 8275 | 6165 | 8113 | 9395 | 5854 | 10305 | NaN | 4848 |
POWER (CUTE) | 23356 | 23356 | 23356 | 23347 | 23347 | 23416 | 23353 | 23350 | NaN | 23353 |
NONSCOMP (CUTE) | 31355 | 33211 | 27801 | 32705 | 31458 | 40807 | 34013 | 23411 | 13367 | 35106 |
QUARTC (CUTE) | 901 | 1254 | 1261 | 1224 | 1224 | 916 | 1254 | 1041 | 1347 | 1305 |
Diagonal 6 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
DIXON3DQ (CUTE) | 36508 | 15534 | 33759 | 15534 | 35926 | 42952 | 15534 | 51284 | NaN | 36796 |
BIGGSB1 (CUTE) | 31960 | 15534 | 31427 | 15534 | 33247 | 40846 | 15534 | 48620 | NaN | 33469 |
Generalized Quartic | 457 | 371 | 370 | 371 | 481 | 529 | 371 | 439 | 446 | 467 |
Diagonal 7 | 160 | NaN | 130 | NaN | 130 | 160 | NaN | 160 | 142 | 13 |
Diagonal 8 | 130 | 130 | 130 | 160 | NaN | 160 | 130 | NaN | NaN | 130 |
Full Hessian FH3 | 139 | 136 | 136 | 136 | 136 | 139 | 136 | 139 | NaN | NaN |
FLETCHCR (CUTE) | 166463 | 165774 | 168739 | 175309 | 175845 | 240240 | 184939 | NaN | 174406 | 215687 |
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 0.656 | 0.516 | 0.781 | 0.719 | 0.594 | 0.438 | 0.719 | 0.688 | 0.844 | 0.688 |
Raydan 2 | 0.031 | 0.063 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | NaN | 0.078 |
Diagonal 2 | 1.453 | 1.328 | 1.656 | 1.172 | 1.438 | 1.797 | 1.813 | 1.266 | 1.250 | 1.141 |
Extended Tridiagonal 1 | 1.016 | 1.125 | 1.359 | 2.250 | 2.375 | 7.578 | 1.672 | 1.375 | NaN | 0.922 |
Diagonal 4 | 0.031 | 0.031 | 0.031 | 0.078 | 0.078 | 0.047 | 0.109 | 0.094 | NaN | 0.094 |
Diagonal 5 | 0.141 | 0.063 | 0.156 | 0.094 | 0.094 | 0.125 | 0.109 | 0.078 | 0.219 | 0.156 |
Extended Himmelblau | 0.172 | 0.172 | 0.109 | 0.141 | 0.172 | 0.141 | 0.125 | 0.141 | 0.172 | 0.125 |
Full Hessian FH2 | 83.125 | 91.938 | 86.984 | 85.766 | 94.484 | 78.281 | 77.141 | 74.500 | 80.969 | 82.469 |
Perturbed quadratic diagonal | 0.406 | 0.609 | 0.641 | 0.375 | 0.563 | 0.359 | 0.328 | 0.344 | NaN | 0.734 |
Quadratic QF1 | 0.359 | 0.438 | 0.422 | 0.422 | 0.406 | 0.391 | 0.484 | 0.422 | NaN | 0.281 |
Quadratic QF2 | 1.047 | 1.313 | 1.203 | 1.156 | 1.063 | 1.156 | 1.000 | NaN | 1.094 | 1.047 |
TRIDIA (CUTE) | NaN | NaN | NaN | 1.688 | 1.391 | 1.859 | NaN | NaN | 1.875 | 1.391 |
Almost Perturbed Quadratic | 0.406 | 0.438 | 0.516 | 0.594 | 0.250 | 0.359 | 0.406 | 0.578 | 0.641 | 0.422 |
LIARWHD (CUTE) | 0.938 | 0.828 | 1.203 | 0.797 | 1.125 | 1.172 | 0.938 | 1.203 | NaN | 0.594 |
POWER (CUTE) | 1.563 | 1.672 | 1.750 | 1.609 | 1.625 | 1.578 | 1.625 | 1.188 | NaN | 1.453 |
NONSCOMP (CUTE) | 1.547 | 1.484 | 1.063 | 1.766 | 1.422 | 1.719 | 1.516 | 1.063 | 1.203 | 1.703 |
QUARTC (CUTE) | 0.750 | 1.000 | 0.969 | 0.969 | 0.875 | 0.797 | 0.938 | 0.703 | 1.266 | 0.93 |
Diagonal 6 | 0.078 | 0.078 | 0.078 | 0.094 | 0.063 | 0.016 | 0.016 | 0.125 | NaN | 0.109 |
DIXON3DQ (CUTE) | 2.047 | 1.453 | 2.016 | 1.484 | 2.359 | 2.234 | 1.406 | 2.297 | NaN | 2.078 |
BIGGSB1 (CUTE) | 1.875 | 2.047 | 2.359 | 1.750 | 2.250 | 2.391 | 1.422 | 2.672 | NaN | 2.422 |
Generalized Quartic | 0.063 | 0.125 | 0.141 | 0.156 | 0.125 | 0.094 | 0.078 | 0.109 | 0.172 | 0.109 |
Diagonal 7 | 0.063 | NaN | 0.016 | NaN | 0.109 | 0.063 | NaN | 0.063 | 0.063 | 0.063 |
Diagonal 8 | 0.078 | 0.125 | 0.078 | 0.031 | NaN | 0.063 | 0.109 | NaN | NaN | 0.078 |
Full Hessian FH3 | 0.063 | 0.047 | 0.109 | 0.047 | 0.031 | 0.063 | 0.047 | 0.109 | NaN | NaN |
FLETCHCR (CUTE) | 5.656 | 6.750 | 7.922 | 9.484 | 6.484 | 8.766 | 7.281 | NaN | 6.906 | 7.547 |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Label | T1 | T2 | T3 | T4 | T5 | T6 |
Value of the scalar |
0.05 | 0.1 | 0.2 | 0.5 | 0.9 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 32980.14 | 31281.32 | 33640.45 | 32942.36 | 34448.32 | 33872.36 |
DLSDL | 30694.00 | 28701.14 | 31048.32 | 30594.77 | 31926.59 | 31573.05 |
MHSDL | 29289.73 | 27653.64 | 29660.00 | 29713.50 | 30491.18 | 30197.27 |
MLSDL | 25398.82 | 22941.77 | 24758.27 | 24250.68 | 25722.64 | 25032.64 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 1228585.50 | 1191960.55 | 1252957.09 | 1238044.36 | 1271176.59 | 1255710.45 |
DLSDL | 1131421.41 | 1083535.14 | 1149482.41 | 1134315.00 | 1167030.14 | 1158554.77 |
MHSDL | 1089700.41 | 1036710.32 | 1089777.64 | 1091985.41 | 1105299.91 | 1101380.18 |
MLSDL | 904217.14 | 845017.55 | 891669.50 | 879473.14 | 913165.68 | 895652.36 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 902.06 | 894.73 | 917.77 | 930.56 | 911.28 | 870.93 |
DLSDL | 816.08 | 790.63 | 804.69 | 816.28 | 803.84 | 809.67 |
MHSDL | 770.78 | 751.65 | 728.61 | 749.70 | 712.64 | 720.57 |
MLSDL | 573.14 | 587.41 | 581.50 | 576.32 | 582.62 | 580.96 |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Quasi-Newton Eqs. | Ref. | |
[104] | ||
[72] | ||
[123] | ||
[133] | ||
[134] | ||
[59] |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Title | Year | Reference | |
1952 | [60] | ||
1964 | [48] | ||
1967 | [38] | ||
1969 | [102,103] | ||
1987 | [47] | ||
1991 | [79] | ||
1999 | [30] |
Denominator | |||
Numerator | |||
FR | DY | CD | |
PRP | HS | LS |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
IT profile | FE profile | CPU time | |||||||
Test function | AGD | MSM | SM | AGD | MSM | SM | AGD | MSM | SM |
Perturbed Quadratic | 353897 | 34828 | 59908 | 13916515 | 200106 | 337910 | 6756.047 | 116.281 | 185.641 |
Raydan 1 | 22620 | 26046 | 14918 | 431804 | 311260 | 81412 | 158.359 | 31.906 | 36.078 |
Diagonal 3 | 120416 | 7030 | 12827 | 4264718 | 38158 | 69906 | 5527.844 | 52.609 | 102.875 |
Generalized Tridiagonal 1 | 670 | 346 | 325 | 9334 | 1191 | 1094 | 11.344 | 1.469 | 1.203 |
Extended Tridiagonal 1 | 3564 | 1370 | 4206 | 14292 | 10989 | 35621 | 55.891 | 29.047 | 90.281 |
Extended TET | 443 | 156 | 156 | 3794 | 528 | 528 | 3.219 | 0.516 | 0.594 |
Diagonal 4 | 120 | 96 | 96 | 1332 | 636 | 636 | 0.781 | 0.203 | 0.141 |
Extended Himmelblau | 396 | 260 | 196 | 6897 | 976 | 668 | 1.953 | 0.297 | 0.188 |
Perturbed quadratic diagonal | 2542050 | 37454 | 44903 | 94921578 | 341299 | 460028 | 44978.750 | 139.625 | 185.266 |
Quadratic QF1 | 366183 | 36169 | 62927 | 13310016 | 208286 | 352975 | 12602.563 | 81.531 | 138.172 |
Extended quadratic penalty QP1 | 210 | 369 | 271 | 2613 | 2196 | 2326 | 1.266 | 1.000 | 0.797 |
Extended quadratic penalty QP2 | 395887 | 1674 | 3489 | 9852040 | 11491 | 25905 | 3558.734 | 3.516 | 6.547 |
Quadratic QF2 | 100286 | 32727 | 64076 | 3989239 | 183142 | 353935 | 1582.766 | 73.438 | 132.703 |
Extended quadratic exponential EP1 | 48 | 100 | 73 | 990 | 894 | 661 | 0.750 | 0.688 | 0.438 |
Extended Tridiagonal 2 | 1657 | 659 | 543 | 8166 | 2866 | 2728 | 3.719 | 1.047 | 1.031 |
ARWHEAD (CUTE) | 5667 | 430 | 270 | 214284 | 5322 | 3919 | 95.641 | 1.969 | 1.359 |
Almost Perturbed Quadratic | 356094 | 33652 | 60789 | 14003318 | 194876 | 338797 | 13337.125 | 73.047 | 133.516 |
LIARWHD (CUTE) | 1054019 | 3029 | 18691 | 47476667 | 27974 | 180457 | 27221.516 | 9.250 | 82.016 |
ENGVAL1 (CUTE) | 743 | 461 | 375 | 6882 | 2285 | 2702 | 3.906 | 1.047 | 1.188 |
QUARTC (CUTE) | 171 | 217 | 290 | 402 | 494 | 640 | 2.469 | 1.844 | 2.313 |
Generalized Quartic | 187 | 181 | 189 | 849 | 493 | 507 | 0.797 | 0.281 | 0.188 |
Diagonal 7 | 72 | 147 | 108 | 333 | 504 | 335 | 0.625 | 0.547 | 0.375 |
Diagonal 8 | 60 | 120 | 118 | 304 | 383 | 711 | 0.438 | 0.469 | 0.797 |
Full Hessian FH3 | 45 | 63 | 63 | 1352 | 566 | 631 | 1.438 | 0.391 | 0.391 |
Diagonal 9 | 329768 | 10540 | 13619 | 13144711 | 68189 | 89287 | 6353.172 | 43.609 | 38.672 |
IT profile | FE profile | CPU time | |||||||
Test function | HS | PRP | LS | HS | PRP | LS | HS | PRP | LS |
Perturbed Quadratic | 1157 | 1157 | 6662 | 3481 | 3481 | 19996 | 0.234 | 0.719 | 1.438 |
Raydan 2 | NaN | 174 | 40 | NaN | 373 | 120 | NaN | 0.094 | 0.078 |
Diagonal 2 | NaN | 1721 | 5007 | NaN | 6594 | 15498 | NaN | 1.313 | 2.891 |
Extended Tridiagonal 1 | NaN | 170 | 17079 | NaN | 560 | 54812 | NaN | 0.422 | 13.641 |
Diagonal 4 | NaN | 70 | 1927 | NaN | 180 | 5739 | NaN | 0.078 | 0.391 |
Diagonal 5 | NaN | 154 | 30 | NaN | 338 | 90 | NaN | 0.172 | 0.078 |
Extended Himmelblau | 160 | 120 | 241 | 820 | 600 | 1043 | 0.172 | 0.125 | 0.172 |
Full Hessian FH2 | 5096 | 5686 | 348414 | 15294 | 17065 | 1045123 | 83.891 | 80.625 | 5081.875 |
Perturbed quadratic diagonal | 1472 | 1120 | 21667 | 4419 | 3363 | 65057 | 0.438 | 0.391 | 2.547 |
Quadratic QF1 | 1158 | 1158 | 5612 | 3484 | 3484 | 16813 | 0.281 | 0.313 | 1.047 |
Extended quadratic penalty QP2 | NaN | 533 | NaN | NaN | 5395 | NaN | NaN | 0.781 | NaN |
Quadratic QF2 | 2056 | 2311 | NaN | 9168 | 9862 | NaN | 0.969 | 0.859 | NaN |
Extended quadratic exponential EP1 | NaN | NaN | 70 | NaN | NaN | 350 | NaN | NaN | 0.141 |
TRIDIA (CUTE) | 6835 | 6744 | NaN | 20521 | 20248 | NaN | 1.438 | 1.094 | NaN |
Almost Perturbed Quadratic | 1158 | 1158 | 5996 | 3484 | 3484 | 17998 | 0.281 | 0.328 | 1.063 |
LIARWHD (CUTE) | NaN | 408 | 11498 | NaN | 4571 | 50814 | NaN | 0.438 | 2.969 |
POWER (CUTE) | 7781 | 7789 | 190882 | 23353 | 23377 | 572656 | 1.422 | 1.219 | 14.609 |
NONSCOMP (CUTE) | 4545 | 3647 | NaN | 15128 | 12433 | NaN | 0.875 | 0.656 | NaN |
QUARTC (CUTE) | NaN | 165 | 155 | NaN | 1347 | 1466 | NaN | 0.781 | 0.766 |
Diagonal 6 | NaN | 174 | 137 | NaN | 373 | 442 | NaN | 0.109 | 0.125 |
DIXON3DQ (CUTE) | NaN | 12595 | 12039 | NaN | 37714 | 36091 | NaN | 1.641 | 2.859 |
BIGGSB1 (CUTE) | NaN | 11454 | 11517 | NaN | 34293 | 34530 | NaN | 1.969 | 2.141 |
Generalized Quartic | NaN | 134 | 139 | NaN | 458 | 445 | NaN | 0.125 | 0.094 |
Diagonal 7 | NaN | 51 | 80 | NaN | 142 | 240 | NaN | 0.063 | 0.109 |
Diagonal 8 | NaN | 70 | 80 | NaN | 180 | 180 | NaN | 0.063 | 0.125 |
FLETCHCR (CUTE) | 18292 | 19084 | 20354 | 178305 | 170266 | 171992 | 8.859 | 6.203 | 7.484 |
IT profile | FE profile | CPU time | |||||||
Test function | DY | FR | CD | DY | FR | CD | DY | FR | CD |
Perturbed Quadratic | 1157 | 1157 | 1157 | 3481 | 3481 | 3481 | 0.469 | 0.609 | 0.531 |
Raydan 2 | 86 | 40 | 40 | 192 | 100 | 100 | 0.063 | 0.016 | 0.016 |
Diagonal 2 | 1636 | 3440 | 2058 | 4774 | 7982 | 8063 | 0.922 | 1.563 | 1.297 |
Extended Tridiagonal 1 | 2081 | 690 | 1140 | 4639 | 2022 | 2984 | 1.703 | 1.141 | 1.578 |
Diagonal 4 | 70 | 70 | 70 | 200 | 200 | 200 | 0.047 | 0.031 | 0.016 |
Diagonal 5 | 40 | 124 | 155 | 100 | 258 | 320 | 0.109 | 0.141 | 0.125 |
Extended Himmelblau | 383 | 339 | 207 | 1669 | 1467 | 961 | 0.219 | 0.172 | 0.172 |
Full Hessian FH2 | 4682 | 4868 | 4794 | 14054 | 14610 | 14390 | 65.938 | 66.469 | 65.922 |
Perturbed quadratic diagonal | 1036 | 1084 | 1276 | 3114 | 3258 | 3834 | 0.406 | 0.422 | 0.422 |
Quadratic QF1 | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.297 | 0.297 | 0.328 |
Quadratic QF2 | NaN | NaN | 2349 | NaN | NaN | 10073 | NaN | NaN | 1.531 |
Extended quadratic exponential EP1 | NaN | 60 | 60 | NaN | 310 | 310 | NaN | 0.109 | 0.125 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.422 | 0.453 | 0.391 |
LIARWHD (CUTE) | 2812 | 1202 | 1255 | 12366 | 7834 | 7379 | 0.938 | 1.000 | 1.109 |
POWER (CUTE) | 7779 | 7781 | 7782 | 23347 | 23353 | 23356 | 1.078 | 1.500 | 1.328 |
NONSCOMP (CUTE) | 2558 | 13483 | 10901 | 49960 | 43268 | 33413 | 1.203 | 1.406 | 1.422 |
QUARTC (CUTE) | 134 | 94 | 95 | 1132 | 901 | 916 | 0.688 | 0.672 | 0.563 |
Diagonal 6 | 86 | 40 | 40 | 192 | 100 | 100 | 0.047 | 0.063 | 0.063 |
DIXON3DQ (CUTE) | 16047 | 18776 | 19376 | 48172 | 56369 | 58176 | 2.266 | 2.516 | 2.734 |
BIGGSB1 (CUTE) | 15274 | 17835 | 18374 | 45853 | 53546 | 55170 | 2.875 | 2.922 | 2.484 |
Generalized Quartic | 142 | 214 | 173 | 497 | 712 | 589 | 0.078 | 0.172 | 0.109 |
Diagonal 7 | 50 | 50 | 50 | 160 | 160 | 160 | 0.063 | 0.047 | 0.094 |
Diagonal 8 | 50 | 40 | 40 | 160 | 130 | 130 | 0.109 | 0.125 | 0.063 |
Full Hessian FH3 | 43 | 43 | 43 | 139 | 139 | 139 | 0.063 | 0.109 | 0.109 |
FLETCHCR (CUTE) | NaN | NaN | 26793 | NaN | NaN | 240237 | NaN | NaN | 10.203 |
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 |
Raydan 2 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
Diagonal 2 | 1584 | 1581 | 1542 | 1488 | 1500 | 2110 | 2193 | 1843 | 1475 | 1453 |
Extended Tridiagonal 1 | 805 | 623 | 754 | 2110 | 2160 | 10129 | 1167 | 966 | NaN | 270 |
Diagonal 4 | 60 | 60 | 70 | 60 | 70 | 70 | 60 | 70 | NaN | 113 |
Diagonal 5 | 124 | 39 | 98 | 39 | 120 | 109 | 39 | 141 | 154 | 130 |
Extended Himmelblau | 145 | 139 | 111 | 161 | 181 | 207 | 159 | 381 | 109 | 108 |
Full Hessian FH2 | 5036 | 5036 | 5036 | 4820 | 4820 | 4800 | 4994 | 4789 | 5163 | 5705 |
Perturbed quadratic diagonal | 1228 | 1214 | 1266 | 934 | 1093 | 987 | 996 | 1016 | NaN | 2679 |
Quadratic QF1 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | NaN | 1158 |
Quadratic QF2 | 2125 | 2098 | 2174 | 1995 | 1991 | 2425 | 2378 | NaN | 2204 | 2034 |
TRIDIA (CUTE) | NaN | NaN | NaN | 6210 | 6210 | 5594 | NaN | NaN | 6748 | 7345 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 |
LIARWHD (CUTE) | 1367 | 817 | 1592 | 1024 | 1831 | 1774 | 531 | 2152 | NaN | 573 |
POWER (CUTE) | 7782 | 7782 | 7782 | 7779 | 7779 | 7802 | 7781 | 7780 | NaN | 7781 |
NONSCOMP (CUTE) | 10092 | 10746 | 8896 | 10466 | 9972 | 13390 | 11029 | 3520 | 3988 | 11411 |
QUARTC (CUTE) | 94 | 160 | 145 | 150 | 126 | 95 | 160 | 114 | 165 | 154 |
Diagonal 6 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
DIXON3DQ (CUTE) | 12182 | 5160 | 11257 | 5160 | 11977 | 14302 | 5160 | 17080 | NaN | 12264 |
BIGGSB1 (CUTE) | 10664 | 5160 | 10479 | 5160 | 11082 | 13600 | 5160 | 16192 | NaN | 11151 |
Generalized Quartic | 129 | 107 | 110 | 107 | 142 | 153 | 107 | 123 | 131 | 145 |
Diagonal 7 | 50 | NaN | 40 | NaN | 40 | 50 | NaN | 50 | 51 | 40 |
Diagonal 8 | 40 | 40 | 40 | 50 | NaN | 50 | 40 | NaN | NaN | 40 |
Full Hessian FH3 | 43 | 42 | 42 | 42 | 42 | 43 | 42 | 43 | NaN | NaN |
FLETCHCR (CUTE) | 17821 | 17632 | 18568 | 17272 | 17446 | 26794 | 24865 | NaN | 17315 | 20813 |
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 |
Raydan 2 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
Diagonal 2 | 6136 | 6217 | 6006 | 5923 | 5944 | 8281 | 8594 | 4822 | 5711 | 5636 |
Extended Tridiagonal 1 | 2369 | 1991 | 2275 | 4678 | 4924 | 22418 | 3119 | 2661 | NaN | 869 |
Diagonal 4 | 170 | 170 | 200 | 170 | 200 | 200 | 170 | 200 | NaN | 339 |
Diagonal 5 | 258 | 88 | 206 | 88 | 270 | 228 | 88 | 292 | 338 | 270 |
Extended Himmelblau | 855 | 687 | 583 | 763 | 813 | 961 | 757 | 1613 | 567 | 594 |
Full Hessian FH2 | 15115 | 15115 | 15115 | 14467 | 14467 | 14407 | 14989 | 14374 | 15495 | 17122 |
Perturbed quadratic diagonal | 3686 | 3647 | 3805 | 2805 | 3282 | 2967 | 2993 | 3053 | NaN | 8044 |
Quadratic QF1 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | NaN | 3484 |
Quadratic QF2 | 9455 | 9202 | 9501 | 9016 | 9054 | 10229 | 10086 | NaN | 9531 | 9085 |
TRIDIA (CUTE) | NaN | NaN | NaN | 18640 | 18640 | 16792 | NaN | NaN | 20260 | 22051 |
Almost Perturbed Quadratic | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 |
LIARWHD (CUTE) | 7712 | 5931 | 8275 | 6165 | 8113 | 9395 | 5854 | 10305 | NaN | 4848 |
POWER (CUTE) | 23356 | 23356 | 23356 | 23347 | 23347 | 23416 | 23353 | 23350 | NaN | 23353 |
NONSCOMP (CUTE) | 31355 | 33211 | 27801 | 32705 | 31458 | 40807 | 34013 | 23411 | 13367 | 35106 |
QUARTC (CUTE) | 901 | 1254 | 1261 | 1224 | 1224 | 916 | 1254 | 1041 | 1347 | 1305 |
Diagonal 6 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
DIXON3DQ (CUTE) | 36508 | 15534 | 33759 | 15534 | 35926 | 42952 | 15534 | 51284 | NaN | 36796 |
BIGGSB1 (CUTE) | 31960 | 15534 | 31427 | 15534 | 33247 | 40846 | 15534 | 48620 | NaN | 33469 |
Generalized Quartic | 457 | 371 | 370 | 371 | 481 | 529 | 371 | 439 | 446 | 467 |
Diagonal 7 | 160 | NaN | 130 | NaN | 130 | 160 | NaN | 160 | 142 | 13 |
Diagonal 8 | 130 | 130 | 130 | 160 | NaN | 160 | 130 | NaN | NaN | 130 |
Full Hessian FH3 | 139 | 136 | 136 | 136 | 136 | 139 | 136 | 139 | NaN | NaN |
FLETCHCR (CUTE) | 166463 | 165774 | 168739 | 175309 | 175845 | 240240 | 184939 | NaN | 174406 | 215687 |
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 0.656 | 0.516 | 0.781 | 0.719 | 0.594 | 0.438 | 0.719 | 0.688 | 0.844 | 0.688 |
Raydan 2 | 0.031 | 0.063 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | NaN | 0.078 |
Diagonal 2 | 1.453 | 1.328 | 1.656 | 1.172 | 1.438 | 1.797 | 1.813 | 1.266 | 1.250 | 1.141 |
Extended Tridiagonal 1 | 1.016 | 1.125 | 1.359 | 2.250 | 2.375 | 7.578 | 1.672 | 1.375 | NaN | 0.922 |
Diagonal 4 | 0.031 | 0.031 | 0.031 | 0.078 | 0.078 | 0.047 | 0.109 | 0.094 | NaN | 0.094 |
Diagonal 5 | 0.141 | 0.063 | 0.156 | 0.094 | 0.094 | 0.125 | 0.109 | 0.078 | 0.219 | 0.156 |
Extended Himmelblau | 0.172 | 0.172 | 0.109 | 0.141 | 0.172 | 0.141 | 0.125 | 0.141 | 0.172 | 0.125 |
Full Hessian FH2 | 83.125 | 91.938 | 86.984 | 85.766 | 94.484 | 78.281 | 77.141 | 74.500 | 80.969 | 82.469 |
Perturbed quadratic diagonal | 0.406 | 0.609 | 0.641 | 0.375 | 0.563 | 0.359 | 0.328 | 0.344 | NaN | 0.734 |
Quadratic QF1 | 0.359 | 0.438 | 0.422 | 0.422 | 0.406 | 0.391 | 0.484 | 0.422 | NaN | 0.281 |
Quadratic QF2 | 1.047 | 1.313 | 1.203 | 1.156 | 1.063 | 1.156 | 1.000 | NaN | 1.094 | 1.047 |
TRIDIA (CUTE) | NaN | NaN | NaN | 1.688 | 1.391 | 1.859 | NaN | NaN | 1.875 | 1.391 |
Almost Perturbed Quadratic | 0.406 | 0.438 | 0.516 | 0.594 | 0.250 | 0.359 | 0.406 | 0.578 | 0.641 | 0.422 |
LIARWHD (CUTE) | 0.938 | 0.828 | 1.203 | 0.797 | 1.125 | 1.172 | 0.938 | 1.203 | NaN | 0.594 |
POWER (CUTE) | 1.563 | 1.672 | 1.750 | 1.609 | 1.625 | 1.578 | 1.625 | 1.188 | NaN | 1.453 |
NONSCOMP (CUTE) | 1.547 | 1.484 | 1.063 | 1.766 | 1.422 | 1.719 | 1.516 | 1.063 | 1.203 | 1.703 |
QUARTC (CUTE) | 0.750 | 1.000 | 0.969 | 0.969 | 0.875 | 0.797 | 0.938 | 0.703 | 1.266 | 0.93 |
Diagonal 6 | 0.078 | 0.078 | 0.078 | 0.094 | 0.063 | 0.016 | 0.016 | 0.125 | NaN | 0.109 |
DIXON3DQ (CUTE) | 2.047 | 1.453 | 2.016 | 1.484 | 2.359 | 2.234 | 1.406 | 2.297 | NaN | 2.078 |
BIGGSB1 (CUTE) | 1.875 | 2.047 | 2.359 | 1.750 | 2.250 | 2.391 | 1.422 | 2.672 | NaN | 2.422 |
Generalized Quartic | 0.063 | 0.125 | 0.141 | 0.156 | 0.125 | 0.094 | 0.078 | 0.109 | 0.172 | 0.109 |
Diagonal 7 | 0.063 | NaN | 0.016 | NaN | 0.109 | 0.063 | NaN | 0.063 | 0.063 | 0.063 |
Diagonal 8 | 0.078 | 0.125 | 0.078 | 0.031 | NaN | 0.063 | 0.109 | NaN | NaN | 0.078 |
Full Hessian FH3 | 0.063 | 0.047 | 0.109 | 0.047 | 0.031 | 0.063 | 0.047 | 0.109 | NaN | NaN |
FLETCHCR (CUTE) | 5.656 | 6.750 | 7.922 | 9.484 | 6.484 | 8.766 | 7.281 | NaN | 6.906 | 7.547 |
Algorithm 6 Modified Dai-Liao conjugate gradient methods. |
Require: A starting point 1: Set 2: If STOP; else perform Step 3. 3: Determine 4: Compute 5: Calculate 6: Calculate 7: Compute 8: |
Label | T1 | T2 | T3 | T4 | T5 | T6 |
Value of the scalar |
0.05 | 0.1 | 0.2 | 0.5 | 0.9 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 32980.14 | 31281.32 | 33640.45 | 32942.36 | 34448.32 | 33872.36 |
DLSDL | 30694.00 | 28701.14 | 31048.32 | 30594.77 | 31926.59 | 31573.05 |
MHSDL | 29289.73 | 27653.64 | 29660.00 | 29713.50 | 30491.18 | 30197.27 |
MLSDL | 25398.82 | 22941.77 | 24758.27 | 24250.68 | 25722.64 | 25032.64 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 1228585.50 | 1191960.55 | 1252957.09 | 1238044.36 | 1271176.59 | 1255710.45 |
DLSDL | 1131421.41 | 1083535.14 | 1149482.41 | 1134315.00 | 1167030.14 | 1158554.77 |
MHSDL | 1089700.41 | 1036710.32 | 1089777.64 | 1091985.41 | 1105299.91 | 1101380.18 |
MLSDL | 904217.14 | 845017.55 | 891669.50 | 879473.14 | 913165.68 | 895652.36 |
Method | T1 | T2 | T3 | T4 | T5 | T6 |
DHSDL | 902.06 | 894.73 | 917.77 | 930.56 | 911.28 | 870.93 |
DLSDL | 816.08 | 790.63 | 804.69 | 816.28 | 803.84 | 809.67 |
MHSDL | 770.78 | 751.65 | 728.61 | 749.70 | 712.64 | 720.57 |
MLSDL | 573.14 | 587.41 | 581.50 | 576.32 | 582.62 | 580.96 |