In this survey, we focus on solving the unconstrained nonlinear optimization problem
\[ \min f(x), \quad x\in\mathbb{R}^n, \tag{1.1} \]
where the function $f:\mathbb{R}^n\to\mathbb{R}$ is continuously differentiable. Most iterative methods for solving (1.1) generate iterates of the general form
\[ x_{k+1}=x_k+\alpha_k d_k, \quad k\ge 0, \tag{1.2} \]
where the step-size $\alpha_k>0$ is determined by a line search and $d_k$ is a search direction.
The following notation will be used, as usual:
\[ g(x)=\nabla f(x),\quad G(x)=\nabla^2 f(x),\quad g_k=\nabla f(x_k),\quad G_k=\nabla^2 f(x_k), \]
where $\nabla f$ and $\nabla^2 f$ denote the gradient and the Hessian of $f$, respectively. The first-order Taylor approximation of $f$ along the direction $d_k$ gives
\[ f(x_{k+1})\approx f(x_k)+\alpha_k g_k^{T}d_k. \tag{1.3} \]
Therefore, an appropriate descent search direction $d_k$ must satisfy
\[ g_k^{T}d_k<0, \quad \text{for all } k. \tag{1.4} \]
The primary choice for a descent direction is $d_k=-g_k$, which leads to the gradient descent (GD) iterations
\[ x_{k+1}=x_k-\alpha_k g_k. \tag{1.5} \]
In this paper, we survey gradient methods satisfying the descent condition (1.4). If there exists a constant $c>0$ such that
\[ g_k^{T}d_k<-c\|g_k\|^2, \quad \text{for all } k, \tag{1.6} \]
then the direction $d_k$ is said to satisfy the sufficient descent condition.
As for the choice of search direction, one of the possible choices in unconstrained optimization is to move from the current point along the negative gradient in each iteration, which corresponds to $d_k=-g_k$ in (1.5).
The main advantage of GD methods is their simplicity and low cost per iteration; their main disadvantage is slow, at best linear, convergence on ill-conditioned problems.
Another important search direction is the Newton direction, obtained by minimizing the quadratic model
\[ \Phi(d):=f(x_k+d)\approx f(x_k)+g_k^{T}d+\tfrac{1}{2}d^{T}G_k d. \tag{1.7} \]
The minimizer of (1.7), provided $G_k$ is positive definite, is
\[ d_k=-G_k^{-1}g_k. \]
So, the pure Newton method is defined by
\[ x_{k+1}=x_k-G_k^{-1}g_k. \tag{1.8} \]
The Newton method with line search uses an appropriate step-size $\alpha_k$ in
\[ x_{k+1}=x_k-\alpha_k G_k^{-1}g_k, \tag{1.9} \]
wherein the step-size $\alpha_k$ is determined by an exact or inexact line search.
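To make iterations (1.8) and (1.9) concrete, the following minimal MATLAB sketch applies the Newton step to a small quadratic test problem; the objective, tolerance and unit step-size are illustrative assumptions, not part of the survey.

```matlab
% Minimal sketch of the Newton iteration (1.9); the quadratic test
% function and all constants are illustrative assumptions.
A = [3 1; 1 2];  b = [1; 1];                 % example SPD data
f    = @(x) 0.5*x'*A*x - b'*x;               % objective f(x)
grad = @(x) A*x - b;                         % gradient g(x)
hess = @(x) A;                               % Hessian G(x)

x = [5; -3];                                 % starting point x0
for k = 0:99
    g = grad(x);
    if norm(g) <= 1e-6, break; end           % stopping criterion
    d = -hess(x) \ g;                        % Newton direction -G_k^{-1} g_k via a linear solve
    alpha = 1;                               % alpha_k = 1 recovers the pure Newton method (1.8)
    x = x + alpha*d;                         % iteration (1.9)
end
```

Note that the direction is obtained by solving the linear system $G_k d=-g_k$ rather than by forming $G_k^{-1}$ explicitly, which is both cheaper and numerically safer.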
The Newton method exhibits three major drawbacks in practical applications: the Hessian $G_k$ may fail to be positive definite, in which case $d_k$ need not be a descent direction; computing second derivatives and solving a linear system in every iteration is expensive; and convergence is guaranteed only locally, from starting points sufficiently close to the solution. Due to that, numerous modifications of it were created, which can be globally divided into two large groups: modified Newton methods and quasi-Newton (QN) methods. The QN methods aim to address all the above difficulties of the Newton method. The first drawback is overcome by taking an appropriately defined positive definite approximation $B_k$ of $G_k$ (or $H_k$ of $G_k^{-1}$), built from gradient information only.
QN methods and modified Newton methods are among the most powerful methods for unconstrained optimization and are applicable to many nonlinear optimization problems. A survey of QN methods for solving nonlinear least-squares problems was given in [86]. Optimization methods have found a number of applications in fluid mechanics [44], free surface flow and solid body contact [13], finding the optimal trajectory for an aircraft or a robot arm, designing a portfolio of investments, controlling a chemical process, and computing the optimal shape of an automobile; see [90] for more details. A modification of the quasi-Newton method defining a two-phase approximate greatest descent was used in [71]. Several variants of multi-step spectral gradient methods for solving large-scale unconstrained optimization problems were proposed in [111]. The usage of an optimization algorithm in artificial neural networks was considered in [75]. Properties of the Hessian matrix arising in distributed gradient-based multi-agent control systems were considered in [117]. A survey of derivative-free optimization methods was given in [127]. An application of unconstrained optimization in solving the risk probability was presented in [76].
The study of conjugate gradient (CG) methods was started by Hestenes and Stiefel in 1952 in [60], and the development of CG methods for solving large-scale unconstrained optimization problems is still ongoing. After all these years, there is still a need to find a more efficient CG method that will solve unconstrained optimization problems with thousands of variables in the shortest possible time interval as well as with a minimal number of iterations and function evaluations.
CG methods construct a sequence of approximations $x_k$ by the rule (1.2), in which the search direction is defined by
\[ d_k:=d(\beta_k,g_k,d_{k-1})=\begin{cases}-g_0, & k=0,\\ -g_k+\beta_k d_{k-1}, & k\ge 1,\end{cases} \tag{1.10} \]
where $\beta_k$ is a scalar known as the conjugate gradient update parameter (CGUP).
The popularity of CG methods is confirmed by a number of recent surveys and book chapters [42,57,85,87,88]. In addition to this basic information on the chronological development of CG methods, it is also important to mention their applications. In general, CG methods are important in solving large-scale optimization problems. CG iterations are characterized by low memory requirements and strong local and global convergence properties. Based on this fact, these methods are useful in all areas where optimization problems of any kind are involved. CG methods are widely used in solving systems of equations and image restoration problems [12,16,70,78,128,135,136,140], the linear response eigenvalue problem [74], and in regression analysis [108,143]. In that way, CG methods have influenced the development of learning algorithms for artificial neural networks [46,75]. A unique approach to ABS-type CG methods was proposed in [1]. Applications of CG methods in solving very large symmetric positive semidefinite linear systems that appear in optimal surface parameterizations are described in [64]. It is also possible to mention applications in data analysis [110]. A variant of the projected preconditioned conjugate gradient method and its application in solving the linear response eigenvalue problem was investigated in [74].
The main goals of the present survey are to give a unified overview of gradient and conjugate gradient methods for unconstrained optimization, to classify them, and to compare their main representatives theoretically and numerically. The overall structure of the paper, based on the contents of each section, is as follows. Section 1 describes basic notation, introductory notions, preliminaries and motivation. Global algorithms and various line search variants are presented in Section 2. An overview of QN methods and their classification is considered in Section 3. Section 4 gives a specific overview of CG methods. Convergence properties of the considered CG methods are investigated in Section 5; according to the presented taxonomy of basic CG methods, the properties of CG methods with $y_{k-1}^{T}g_k$ and with $\|g_k\|^2$ in the numerator of $\beta_k$ are treated separately. Numerical comparisons of the surveyed methods are reported in the concluding sections.
First, we present an algorithm that describes the general scheme of line search methods.

Algorithm 1. General scheme of line search methods.
Require: A starting point $x_0$ and a tolerance $\epsilon>0$.
1: Set $k=0$.
2: If $\|g_k\|\le\epsilon$, STOP; else perform Step 3.
3: Determine a descent search direction $d_k$.
4: Compute a step-size $\alpha_k$ by a line search.
5: Calculate $x_{k+1}=x_k+\alpha_k d_k$.
6: Set $k:=k+1$ and go to Step 2.
To achieve the global convergence of iterative methods, an appropriate step-size $\alpha_k$ must be determined in each iteration. The exact line search (ELS) considers the univariate function
\[ \Phi(\alpha):=f(x_k+\alpha d_k) \tag{2.1} \]
and the step-size is defined after the unidimensional optimization of the form
\[ f(x_k+\alpha_k d_k)=\min_{\alpha>0}\Phi(\alpha). \tag{2.2} \]
The ELS rule may give the most precise minimum. However, ELS is too expensive in practice or even impossible to implement, especially in situations where the iteration is far from the exact solution.
Applying the iterative procedure (1.2), it is most logical to choose a new point so that the step-size $\alpha_k$ satisfies
\[ \Phi(\alpha_k)=f(x_{k+1})<\Phi(0)=f(x_k). \tag{2.3} \]
Methods that require the fulfillment of condition (2.3) in each iterative step, that is, a reduction of the value of the objective function, define iterations that approach the minimum of the given function step by step. The methods conceived in this way belong to the class of monotone line search methods. Many variants of inexact line search (ILS) rules have been proposed and are dominant in nonlinear optimization. The most commonly used ILS techniques are those of Armijo, Goldstein, Wolfe, Powell, Fletcher and others [4,8,19,50,55,56,109,126]. In most conjugate gradient methods, one of the following ILS procedures is used to calculate the step-size $\alpha_k$.
In contrast to monotone line searches, non-monotone line searches are also known in the literature, where it is not necessary to reduce the value of the objective function in each iteration [51,52,53]. Although non-monotone techniques do not guarantee approaching the minimum of the function in each iteration, they are very common in practical applications and have very good convergence properties. A number of non-monotone line search methods have been proposed recently (see, for example, [119,120]).
A backtracking line search scheme based on the Armijo condition aims to determine the largest acceptable step-size along a given search direction. It starts with a relatively large step-size estimate and iteratively reduces it, according to the local gradient of the objective function, until a sufficient decrease in the value of the objective function is observed. Let $0<\varphi<0.5$ and $\beta\in(0,1)$ be given constants, and let $m_k$ be the smallest nonnegative integer such that
\[ f(x_k+\beta^{m_k}t\,d_k)\le f(x_k)+\varphi\beta^{m_k}t\,g_k^{T}d_k, \quad t>0. \tag{2.4} \]
The backtracking procedure proposed in [4] starts from an initial trial value of the step-size and decreases it by the factor $\beta$ until the Armijo condition (2.4) is satisfied, as summarized in Algorithm 2.
Algorithm 2. Backtracking line search.
Require: Objective function $f$, current point $x_k$, descent direction $d_k$, parameters $0<\varphi<0.5$ and $\beta\in(0,1)$, and an initial trial step $t>0$.
1: Set $\alpha=t$.
2: While $f(x_k+\alpha d_k)>f(x_k)+\varphi\alpha\,g_k^{T}d_k$, set $\alpha:=\beta\alpha$.
3: Return $\alpha_k=\alpha$.
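A minimal MATLAB sketch of Algorithm 2 is given below; the function handle interface and the default constants are illustrative assumptions.

```matlab
% Backtracking line search enforcing the Armijo condition (2.4).
% f:  handle of the objective, xk: current point,
% g:  gradient at xk, d: descent direction, t: initial trial step,
% phi in (0,0.5) and beta in (0,1) are backtracking constants.
function alpha = backtracking(f, xk, g, d, t, phi, beta)
    alpha = t;
    while f(xk + alpha*d) > f(xk) + phi*alpha*(g'*d)
        alpha = beta*alpha;          % reduce the step by the factor beta
    end
end
```

Combined with the direction $d_k=-g_k$, this routine turns the GD scheme (1.5) into a monotone line search method; for example, `alpha = backtracking(f, x, g, -g, 1, 0.25, 0.5)`.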
In order to ensure a sufficient decrease of the objective function, the Goldstein ILS rule requires the following conditions:
\[ f(x_k+\alpha d_k)\le f(x_k)+\rho\,\alpha\,g_k^{T}d_k \tag{2.5} \]
and
\[ f(x_k+\alpha d_k)\ge f(x_k)+(1-\rho)\,\alpha\,g_k^{T}d_k, \tag{2.6} \]
where $0<\rho<1/2$ and $\alpha>0$.
The Wolfe line search conditions are well known and are given by
\[ f(x_k+\alpha_k d_k)\le f(x_k)+\eta\,\alpha_k\,g_k^{T}d_k, \tag{2.7} \]
\[ g(x_k+\alpha_k d_k)^{T}d_k\ge\sigma_1\,g_k^{T}d_k, \tag{2.8} \]
where $0<\eta<\sigma_1<1$. The generalized Wolfe conditions
\[ -\sigma_2\,g_k^{T}d_k\ge g(x_k+\alpha_k d_k)^{T}d_k\ge\sigma_1\,g_k^{T}d_k \tag{2.9} \]
are often used, where $\sigma_2>0$. The choice $\sigma_2=\sigma_1$ reduces (2.9) to the strong Wolfe conditions, consisting of (2.7) and
\[ \left|g(x_k+\alpha_k d_k)^{T}d_k\right|\le-\sigma_1\,g_k^{T}d_k. \tag{2.10} \]
The condition (2.7) of the Wolfe conditions is called the Armijo condition, which is often used alone or in the form of its variants.
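The following MATLAB sketch tests whether a trial step-size satisfies (2.7), (2.8) and the strong variant (2.10); the interface and parameter names are illustrative assumptions.

```matlab
% Check of the Wolfe conditions at a trial step alpha.
% f, gradf: handles of the objective and its gradient;
% xk: current point, d: descent direction; 0 < eta < sigma1 < 1.
function [armijo, wolfe, strong] = wolfe_check(f, gradf, xk, d, alpha, eta, sigma1)
    g0 = gradf(xk);                                          % g_k
    g1 = gradf(xk + alpha*d);                                % g(x_k + alpha_k d_k)
    armijo = f(xk + alpha*d) <= f(xk) + eta*alpha*(g0'*d);   % condition (2.7)
    wolfe  = armijo && (g1'*d >= sigma1*(g0'*d));            % conditions (2.7)-(2.8)
    strong = armijo && (abs(g1'*d) <= -sigma1*(g0'*d));      % strong Wolfe (2.7), (2.10)
end
```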
The most general iterative rule of QN type with line search is of the form
\[ x_{k+1}=x_k-\alpha_k H_k g_k, \tag{3.1} \]
such that $H_k$ is a symmetric positive definite approximation of the inverse Hessian $G_k^{-1}$. In constructing updates of such approximations, the differences
\[ s_k=x_{k+1}-x_k, \quad y_k=g_{k+1}-g_k \tag{3.2} \]
are typical. The update of an approximation $B_k\approx G_k$ has the general form
\[ B_{k+1}=B_k+E_k, \tag{3.3} \]
where $E_k$ is a correction matrix of low rank. The quasi-Newton (secant) condition requires
\[ B_{k+1}s_k=y_k. \tag{3.4} \]
The quasi-Newton condition for the matrix $H_{k+1}=B_{k+1}^{-1}$ is
\[ H_{k+1}y_k=s_k. \tag{3.5} \]
Methods that require the calculation or approximation of the Hessian matrix and its inverse belong to the class of QN methods, together with their numerous modifications. The pure Newton method requires the calculation of the matrix of second derivatives, which is avoided in QN methods. As a consequence, each Newton iteration is computationally expensive and slow, while QN iterations are computationally cheap and fast. On the other hand, the Newton method requires a smaller number of iterative steps and generates a more precise convergence path than QN methods.
The Symmetric Rank-One (SR1) update assumes a correction matrix $E_k$ of rank one in
\[ H_{k+1}=H_k+E_k, \]
that is,
\[ H_{k+1}=H_k+u_k v_k^{T}, \tag{3.6} \]
where $u_k,v_k\in\mathbb{R}^n$.
The quasi-Newton condition (3.5) initiates
\[ H_{k+1}y_k=(H_k+u_k v_k^{T})y_k=s_k, \]
that is,
\[ (v_k^{T}y_k)\,u_k=s_k-H_k y_k. \tag{3.7} \]
The conclusion is that $u_k$ must be proportional to $s_k-H_k y_k$. Assuming
\[ s_k-H_k y_k\ne 0 \]
(otherwise $H_k$ already satisfies the quasi-Newton condition) and $v_k^{T}y_k\ne 0$, one obtains
\[ H_{k+1}=H_k+\frac{1}{v_k^{T}y_k}\,(s_k-H_k y_k)v_k^{T}. \tag{3.8} \]
The condition that the approximation $H_{k+1}$ be symmetric leads to the choice $v_k=s_k-H_k y_k$, which gives
\[ H_{k+1}=H_k+\frac{(s_k-H_k y_k)(s_k-H_k y_k)^{T}}{(s_k-H_k y_k)^{T}y_k}, \tag{3.9} \]
which is the Symmetric Rank-One (SR1) update.
For general functions, Conn, Gould and Toint in [20] proved that the sequence of SR1 Hessian approximations converges to the true Hessian provided that the steps are uniformly linearly independent.
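A minimal MATLAB sketch of the SR1 update (3.9) is shown below; the safeguard that skips near-singular updates is a standard practical device and an assumption here, not part of the derivation above.

```matlab
% One SR1 update (3.9) of the inverse Hessian approximation H.
function H = sr1_update(H, s, y)
    r = s - H*y;                               % s_k - H_k y_k
    denom = r'*y;                              % (s_k - H_k y_k)^T y_k
    if abs(denom) > 1e-8*norm(r)*norm(y)       % assumed safeguard: skip tiny denominators
        H = H + (r*r')/denom;                  % symmetric rank-one correction
    end                                        % otherwise H is left unchanged
end
```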
The Hessian update $B_{k+1}^{DFP}$ is defined as the solution of the variational problem
\[ \min_B \|B-B_k\|, \quad \text{s.t. } B=B^{T},\ B s_k=y_k. \tag{3.10} \]
The solution to (3.10) is equal to
\[ B_{k+1}^{DFP}=(I-\rho_k y_k s_k^{T})B_k^{DFP}(I-\rho_k s_k y_k^{T})+\rho_k y_k y_k^{T}, \quad \rho_k=\frac{1}{y_k^{T}s_k}. \]
The inverse Hessian update can be generated using the Sherman–Morrison–Woodbury formula.
Moreover, the DFP update is known as a rank-two updating method of the form
\[ H_{k+1}=H_k+a\,u_k u_k^{T}+b\,v_k v_k^{T}, \]
where $a,b$ are scalars and $u_k,v_k\in\mathbb{R}^n$. The quasi-Newton condition (3.5) gives
\[ H_k y_k+a\,u_k u_k^{T}y_k+b\,v_k v_k^{T}y_k=s_k. \tag{3.11} \]
The vectors $u_k$ and $v_k$ are chosen as
\[ u_k=s_k, \quad v_k=H_k y_k. \]
Now, (3.11) implies
\[ a=\frac{1}{u_k^{T}y_k}=\frac{1}{s_k^{T}y_k}, \quad b=-\frac{1}{v_k^{T}y_k}=-\frac{1}{y_k^{T}H_k y_k}. \]
So we get
\[ H_{k+1}^{DFP}=H_k+\frac{s_k s_k^{T}}{s_k^{T}y_k}-\frac{H_k y_k y_k^{T}H_k}{y_k^{T}H_k y_k}. \tag{3.12} \]
Formula (3.12) was proposed by Davidon and later developed by Fletcher and Powell, so it is called the DFP update.
One famous and broadly used updating formula is the Broyden–Fletcher–Goldfarb–Shanno (BFGS) rule. The inverse Hessian update $H_{k+1}^{BFGS}$ solves the variational problem
\[ \min_H \|H-H_k\|, \quad \text{s.t. } H=H^{T},\ H y_k=s_k. \tag{3.13} \]
Explicitly, the BFGS update is known as
\[ \begin{aligned} H_{k+1}^{BFGS}&=(I-\rho_k s_k y_k^{T})H_k^{BFGS}(I-\rho_k y_k s_k^{T})+\rho_k s_k s_k^{T}, \quad \rho_k=\frac{1}{y_k^{T}s_k}\\ &=H_k-\frac{H_k y_k s_k^{T}+s_k y_k^{T}H_k}{s_k^{T}y_k}+\left(1+\frac{y_k^{T}H_k y_k}{s_k^{T}y_k}\right)\frac{s_k s_k^{T}}{s_k^{T}y_k}, \end{aligned} \tag{3.14} \]
where $I$ denotes the identity matrix.
The BFGS update formula for the Hessian matrix can be generated using the Sherman–Morrison–Woodbury formula: a rank-one modification (or perturbation) $M=A+bc^{*}$ of a nonsingular matrix $A$ satisfies
\[ M^{-1}=A^{-1}-(1+c^{*}A^{-1}b)^{-1}A^{-1}bc^{*}A^{-1}. \tag{3.15} \]
As a result, the following update for the Hessian approximation is obtained:
\[ B_{k+1}^{BFGS}=B_k^{BFGS}-\frac{B_k s_k s_k^{T}B_k}{s_k^{T}B_k s_k}+\frac{y_k y_k^{T}}{s_k^{T}y_k}. \tag{3.16} \]
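The BFGS formula translates directly into MATLAB; the sketch below updates the inverse approximation by (3.14), and the curvature safeguard is an illustrative assumption.

```matlab
% One BFGS update (3.14) of the inverse Hessian approximation H,
% applied only when the curvature condition s'*y > 0 holds.
function H = bfgs_update(H, s, y)
    sy = s'*y;                                 % s_k^T y_k
    if sy > 1e-10                              % assumed curvature safeguard
        rho = 1/sy;
        I = eye(numel(s));
        H = (I - rho*(s*y'))*H*(I - rho*(y*s')) + rho*(s*s');  % formula (3.14)
    end
end
```

The update is applied only when $s_k^{T}y_k>0$, which preserves positive definiteness of $H_{k+1}$; with a Wolfe line search this condition holds automatically.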
The weighted combinations of the DFP and BFGS updates give a whole class of updates, known as the Broyden class. This class is defined by
\[ H_{k+1}^{\phi}=(1-\phi)H_{k+1}^{DFP}+\phi H_{k+1}^{BFGS}, \tag{3.17} \]
where $\phi\in\mathbb{R}$ is a parameter. Equivalently,
\[ H_{k+1}^{\phi}=H_{k+1}^{DFP}+\phi\,\upsilon_k\upsilon_k^{T}=H_{k+1}^{BFGS}+(\phi-1)\,\upsilon_k\upsilon_k^{T}=H_k+\frac{s_k s_k^{T}}{s_k^{T}y_k}-\frac{H_k y_k y_k^{T}H_k}{y_k^{T}H_k y_k}+\phi\,\upsilon_k\upsilon_k^{T}, \tag{3.18} \]
where
\[ \upsilon_k=(y_k^{T}H_k y_k)^{1/2}\left[\frac{s_k}{s_k^{T}y_k}-\frac{H_k y_k}{y_k^{T}H_k y_k}\right]. \tag{3.19} \]
If we put $\phi=0$ in (3.18), the DFP update is recovered, while $\phi=1$ gives the BFGS update.
The Broyden class of methods can be derived directly from the quasi-Newton equation. Consider the general rank-two updating formula
\[ H_{k+1}=H_k+a\,s_k s_k^{T}+b\,(H_k y_k s_k^{T}+s_k y_k^{T}H_k)+c\,H_k y_k y_k^{T}H_k, \tag{3.20} \]
where $a$, $b$, $c$ are scalars. Imposing the quasi-Newton condition (3.5) gives
\[ 1=a\,s_k^{T}y_k+b\,y_k^{T}H_k y_k, \qquad 0=1+b\,s_k^{T}y_k+c\,y_k^{T}H_k y_k. \tag{3.21} \]
Here we have two equations with three unknowns, so we can introduce the replacement
\[ b=-\phi/s_k^{T}y_k, \]
where $\phi$ is a free parameter. Solving (3.21) for $a$ and $c$ then yields
\[ H_{k+1}^{\phi}=H_k+\frac{s_k s_k^{T}}{s_k^{T}y_k}-\frac{H_k y_k y_k^{T}H_k}{y_k^{T}H_k y_k}+\phi\,\upsilon_k\upsilon_k^{T}=H_{k+1}^{DFP}+\phi\,\upsilon_k\upsilon_k^{T}, \]
where $\upsilon_k$ is defined in (3.19).
A great effort has been invested in discovering QN methods that not only possess convergence but also outperform the BFGS update numerically. Table 1 lists references for some of these modifications of the quasi-Newton equation.

Table 1. Modified quasi-Newton equations proposed in [104], [72], [123], [133], [134] and [59].
It is also important to mention the multi-step spectral gradient method from [111], in which the Hessian approximation is updated as a diagonal matrix
\[ B_{k+1}=\operatorname{diag}\bigl(r_k^{(i)}\bigr) \tag{3.22} \]
with diagonal entries $r_k^{(i)}$ computed from the modified differences
\[ \hat s_k=s_k-\frac{7}{11}s_{k-1}+\frac{2}{11}s_{k-2}, \qquad \hat y_k=y_k-\frac{7}{11}y_{k-1}+\frac{2}{11}y_{k-2}. \]
The general framework of the QN algorithm is given in Algorithm 3.
Algorithm 3. General framework of QN methods.
Require: A starting point $x_0$, an initial approximation $H_0=I$ and a tolerance $\epsilon>0$.
1: Set $k=0$.
2: If $\|g_k\|\le\epsilon$, STOP; else perform Step 3.
3: Compute the search direction $d_k=-H_k g_k$.
4: Determine a step-size $\alpha_k$ by a line search.
5: Calculate $x_{k+1}=x_k+\alpha_k d_k$, $s_k=x_{k+1}-x_k$ and $y_k=g_{k+1}-g_k$.
6: Compute $H_{k+1}$ by an update formula such as (3.9), (3.12) or (3.14).
7: Set $k:=k+1$ and go to Step 2.
The general QN iterative rule with line search
\[ x_{k+1}=x_k-\alpha_k H_k g_k \tag{3.24} \]
assumes that $H_k$ is a symmetric positive definite approximation of the inverse Hessian $G_k^{-1}$.
According to Brezinski's classification in [15], gradient-type methods can be classified by the structure of the updating matrix. The simplest variant approximates the Hessian by a scalar matrix,
\[ B_k=\gamma_k I\approx G_k, \quad \gamma_k>0, \tag{3.25} \]
where $\gamma_k$ is an appropriately chosen acceleration parameter, which leads to the improved gradient descent (IGD) iterations
\[ x_{k+1}=x_k-\gamma_k^{-1}\alpha_k g_k. \tag{3.26} \]
Usually, the parameter $\gamma_{k+1}$ is computed from function and gradient values in two successive iterations, using a Taylor approximation of $f$.
Andrei in [4,6] defined iterations of the form
\[ x_{k+1}=x_k-\theta_k\alpha_k g_k. \tag{3.27} \]
With an acceleration parameter $\theta_k^{AGD}$, the accelerated gradient descent (AGD) iterations become
\[ x_{k+1}^{AGD}=x_k^{AGD}-\theta_k^{AGD}\alpha_k g_k^{AGD}. \tag{3.28} \]
A few modifications of the scheme (3.26) were promoted in [94,96,97,112,114]. The iterations defined in [112] are of the form (3.26), in which the acceleration parameter is denoted by $\gamma_k^{SM}$:
\[ x_{k+1}^{SM}=x_k^{SM}-\alpha_k(\gamma_k^{SM})^{-1}g_k^{SM}, \tag{3.29} \]
where the acceleration parameter is updated by
\[ \gamma_{k+1}^{SM}=2\gamma_k^{SM}\,\frac{\gamma_k^{SM}\left[f(x_{k+1}^{SM})-f(x_k^{SM})\right]+\alpha_k\|g_k^{SM}\|^2}{\alpha_k^2\|g_k^{SM}\|^2}. \tag{3.30} \]
The accelerated double step-size (ADSS) and accelerated double direction (ADD) methods originated in [94] and [96], respectively.
The following iterations are known as the accelerated double step-size (ADSS) iterations [94]:
\[ x_{k+1}^{ADSS}=x_k^{ADSS}-\left(\alpha_k(\gamma_k^{ADSS})^{-1}+l_k\right)g_k^{ADSS}, \tag{3.31} \]
where $l_k$ is the second step-size. A transformation of the ADSS scheme (TADSS) was proposed in [114]:
\[ x_{k+1}^{TADSS}=x_k^{TADSS}-\psi_k g_k^{TADSS}, \]
where $\psi_k=\alpha_k\left((\gamma_k^{TADSS})^{-1}-1\right)+1$.
A particular choice of the scaling parameter $\gamma_k$ leads to the Barzilai–Borwein (BB) method, defined by
\[ x_{k+1}=x_k-\gamma_k^{-1}g_k, \tag{3.32} \]
where
\[ \gamma_k^{BB}=\frac{s_{k-1}^{T}y_{k-1}}{y_{k-1}^{T}y_{k-1}}. \tag{3.33} \]
The symmetric case assumes the minimization of the corresponding secant residual in the alternative form, which gives
\[ \gamma_k^{BB}=\frac{s_{k-1}^{T}s_{k-1}}{s_{k-1}^{T}y_{k-1}}. \tag{3.34} \]
The BB iterations are defined using $\gamma_k^{BB}$ as
\[ x_{k+1}^{BB}=x_k^{BB}-\gamma_k^{BB}g_k^{BB}. \]
The BB method was improved in a number of research articles, main of which are [22,24,34,35,36,37,106,107,122,137].
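The BB iteration is easy to state in code. The following MATLAB sketch uses the step (3.34) on a diagonal quadratic; the test problem, initial step and iteration limit are illustrative assumptions.

```matlab
% Barzilai-Borwein iterations with the step-size (3.34).
A = diag([1 10 100]);  b = [1; 1; 1];        % example quadratic f = 0.5 x'Ax - b'x
grad = @(x) A*x - b;

x = zeros(3,1);  g = grad(x);  gamma = 1;    % assumed initial step
for k = 1:500
    if norm(g) <= 1e-6, break; end
    xnew = x - gamma*g;                      % x_{k+1} = x_k - gamma_k^{BB} g_k
    gnew = grad(xnew);
    s = xnew - x;  y = gnew - g;             % differences (3.2)
    gamma = (s'*s)/(s'*y);                   % BB step-size (3.34)
    x = xnew;  g = gnew;
end
```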
Another member of the IGD iterates is the Scalar Correction (SC) method [84], defined in (3.32) by the rule
\[ \gamma_{k+1}^{SC}=\begin{cases}\dfrac{s_k^{T}r_k}{y_k^{T}r_k}, & y_k^{T}r_k>0,\\[1mm] \dfrac{\|s_k\|}{\|y_k\|}, & y_k^{T}r_k\le 0,\end{cases} \qquad r_k=s_k-\gamma_k y_k. \tag{3.35} \]
Accordingly, the SC iterations are defined by the relation
\[ x_{k+1}^{SC}=x_k^{SC}-\gamma_k^{SC}g_k^{SC}. \]
A relaxed BB method, which modifies the BB step by an additional relaxation parameter, has also been considered in the literature.
A modification of the GD method (1.5) was proposed in [63]. It is defined by
\[ x_{k+1}=M^{(GD)}(x_k)=x_k-(\alpha_k+\alpha_k^2-\alpha_k^3)g_k. \tag{3.36} \]
Further, the next scheme was proposed as the modified SM (MSM) method in [63]:
\[ x_{k+1}^{MSM}=x_k^{MSM}-(\alpha_k+\alpha_k^2-\alpha_k^3)(\gamma_k^{MSM})^{-1}g_k^{MSM}. \tag{3.37} \]
The leading principle used in defining the iterations (3.37) is the replacement of the step-size $\alpha_k$ in the SM method (3.29) by the modified step-size $\alpha_k+\alpha_k^2-\alpha_k^3$, which satisfies
\[ \alpha_k\le\alpha_k+\alpha_k^2-\alpha_k^3\le\alpha_k+\alpha_k^2, \quad \alpha_k\in(0,1]. \]
So, (3.37) is based on a small increase of the step-size generated by the backtracking procedure.
The acceleration parameters of the ADD, ADSS, TADSS and MSM methods are defined as follows:
\[ \gamma_{k+1}^{ADD}=2\,\frac{f(x_{k+1}^{ADD})-f(x_k^{ADD})-\alpha_k(g_k^{ADD})^{T}\left(\alpha_k d_k^{ADD}-(\gamma_k^{ADD})^{-1}g_k^{ADD}\right)}{\left(\alpha_k d_k^{ADD}-(\gamma_k^{ADD})^{-1}g_k^{ADD}\right)^{T}\left(\alpha_k d_k^{ADD}-(\gamma_k^{ADD})^{-1}g_k^{ADD}\right)} \quad \text{(ADD method [96])}, \]
\[ \gamma_{k+1}^{ADSS}=2\,\frac{f(x_{k+1}^{ADSS})-f(x_k^{ADSS})+\left(\alpha_k(\gamma_k^{ADSS})^{-1}+l_k\right)\|g_k^{ADSS}\|^2}{\left(\alpha_k(\gamma_k^{ADSS})^{-1}+l_k\right)^2\|g_k^{ADSS}\|^2} \quad \text{(ADSS method [94])}, \]
\[ \gamma_{k+1}^{TADSS}=2\,\frac{f(x_{k+1}^{TADSS})-f(x_k^{TADSS})+\psi_k\|g_k^{TADSS}\|^2}{\psi_k^2\|g_k^{TADSS}\|^2}, \quad \psi_k=\alpha_k\left((\gamma_k^{TADSS})^{-1}-1\right)+1 \quad \text{(TADSS method [114])}, \]
\[ \gamma_{k+1}^{MSM}=2\gamma_k^{MSM}\,\frac{\gamma_k^{MSM}\left[f(x_{k+1}^{MSM})-f(x_k^{MSM})\right]+(\alpha_k+\alpha_k^2-\alpha_k^3)\|g_k^{MSM}\|^2}{(\alpha_k+\alpha_k^2-\alpha_k^3)^2\|g_k^{MSM}\|^2} \quad \text{(MSM method [63])}. \]
The efficiency of IGD methods was numerically tested in [98].
The author of [41] proposed two relaxed gradient descent quasi-Newton (RGDQN and RGDQN1) iteration rules
\[ x_{k+1}=x_k-\xi_k\alpha_k\gamma_k^{-1}g_k, \tag{3.38} \]
in which the relaxation parameter $\xi_k$ is computed as
\[ \xi_k=\frac{\gamma_k}{\alpha_k\gamma_{k+1}}. \]
The following algorithm, known as the SM method, was introduced in [112].

Algorithm 4. The SM method.
Require: A starting point $x_0$, the initial value $\gamma_0=1$ and a tolerance $\epsilon>0$.
1: Set $k=0$.
2: If $\|g_k\|\le\epsilon$, STOP; else perform Step 3.
3: Determine the step-size $\alpha_k$ by the backtracking procedure from Algorithm 2.
4: Compute $x_{k+1}=x_k-\alpha_k(\gamma_k^{SM})^{-1}g_k$ as in (3.29).
5: Calculate $\gamma_{k+1}^{SM}$ by (3.30).
6: Set $k:=k+1$ and go to Step 2.
An application of the Picard–Mann hybrid iterative process from [66] is another possibility to accelerate iterations for solving nonlinear optimization problems. For a mapping $T$, the process is defined by
\[ \begin{cases} x_1=x\in C,\\ x_{k+1}=T y_k,\\ y_k=(1-\Upsilon_k)x_k+\Upsilon_k T x_k, \quad k\in\mathbb{N}. \end{cases} \tag{3.39} \]
The real numbers $\Upsilon_k\in(0,1)$ form a sequence of parameters. Substituting $y_k$ yields
\[ x_{k+1}=H(T)(x_k)=T\left[(1-\Upsilon_k)x_k+\Upsilon_k T x_k\right], \quad k\in\mathbb{N}. \tag{3.40} \]
The iterates (3.40) are denoted by $H(T)$.
In [66], a set of constant values $\Upsilon_k=\Upsilon$ was proposed. Defining $T$ as the SM iteration map,
\[ T(x_k):=x_k^{SM}-(\gamma_k^{SM})^{-1}\alpha_k g_k^{SM}, \]
the iterations (3.40) become the so-called HSM iterative rule given as
\[ x_{k+1}^{HSM}:=H(SM)(x_k)=x_k^{HSM}-(\Upsilon_k+1)(\gamma_k^{HSM})^{-1}\alpha_k g_k^{HSM}, \tag{3.41} \]
where the corresponding acceleration parameter is computed by
\[ \gamma_{k+1}^{HSM}=2\gamma_k^{HSM}\,\frac{\gamma_k^{HSM}\left[f(x_{k+1}^{HSM})-f(x_k^{HSM})\right]+(\Upsilon_k+1)t_k\|g_k^{HSM}\|^2}{(\Upsilon_k+1)^2 t_k^2\|g_k^{HSM}\|^2}. \]
A modified HSM (MHSM) method is defined in [92] by proposing an appropriate initial value in the backtracking procedure.
A hybridization of the ADD method was considered in [99] in the form
\[ x_{k+1}^{HADD}=x_k^{HADD}-(\Upsilon_k+1)t_k(\gamma_k^{HADD})^{-1}g_k^{HADD}+(\Upsilon_k+1)t_k^2 d_k, \]
wherein the acceleration parameter equals
\[ \gamma_{k+1}^{HADD}=2\,\frac{f(x_{k+1}^{HADD})-f(x_k^{HADD})-(\Upsilon_k+1)(g_k^{HADD})^{T}\left(t_k^2 d_k-t_k(\gamma_k^{HADD})^{-1}g_k^{HADD}\right)}{(\Upsilon_k+1)^2 t_k^2\left(t_k d_k-(\gamma_k^{HADD})^{-1}g_k^{HADD}\right)^{T}\left(t_k d_k-(\gamma_k^{HADD})^{-1}g_k^{HADD}\right)}. \]
Recently, further hybridizations of accelerated gradient methods have been developed along the same lines.
Nonlinear conjugate gradient (CG) methods form an important class of methods for solving unconstrained nonlinear optimization problems and systems of nonlinear equations. Nonlinear CG methods are defined by the line search iterates (1.2), where the search direction $d_k$ is of the form (1.10).
In this article, a review on CG methods for unconstrained optimization is given. Main convergence theorems are provided for the conjugate gradient method assuming the descent property of each search direction. Some research issues on conjugate gradient methods are mentioned.
In [21], the nonlinear CG methods are divided into three classes: early conjugate gradient methods, descent conjugate gradient methods, and sufficient descent conjugate gradient methods. Andrei classified the CG methods into three classes: classical CG methods, scaled CG methods, and hybrid and parameterized CG methods.
The classification presented in this paper divides CG methods into the following categories: basic conjugate gradients methods, considered in Subsection 4.1, Dai-Liao class of methods, presented in Subsection 4.2, hybrid conjugate gradient methods, described in Subsection 4.3, and combined BFGS-CG iterations, in Subsection 4.4.
The CG methods included in Table 2 are known as early or classical conjugate gradient methods.
| Method | $\beta_k$ | Year | Reference |
|---|---|---|---|
| HS (Hestenes–Stiefel) | $\beta_k^{HS}=\dfrac{y_{k-1}^{T}g_k}{y_{k-1}^{T}d_{k-1}}$ | 1952 | [60] |
| FR (Fletcher–Reeves) | $\beta_k^{FR}=\dfrac{\|g_k\|^2}{\|g_{k-1}\|^2}$ | 1964 | [48] |
| D (Daniel) | $\beta_k^{D}=\dfrac{g_k^{T}G_{k-1}d_{k-1}}{d_{k-1}^{T}G_{k-1}d_{k-1}}$ | 1967 | [38] |
| PRP (Polak–Ribière–Polyak) | $\beta_k^{PRP}=\dfrac{y_{k-1}^{T}g_k}{\|g_{k-1}\|^2}$ | 1969 | [102,103] |
| CD (Conjugate Descent) | $\beta_k^{CD}=\dfrac{\|g_k\|^2}{-g_{k-1}^{T}d_{k-1}}$ | 1987 | [47] |
| LS (Liu–Storey) | $\beta_k^{LS}=\dfrac{y_{k-1}^{T}g_k}{-g_{k-1}^{T}d_{k-1}}$ | 1991 | [79] |
| DY (Dai–Yuan) | $\beta_k^{DY}=\dfrac{\|g_k\|^2}{y_{k-1}^{T}d_{k-1}}$ | 1999 | [30] |
where $y_{k-1}=g_k-g_{k-1}$ and $\|\cdot\|$ denotes the Euclidean vector norm.
In the listed CG methods, the numerator of the update parameter $\beta_k$ is either $\|g_k\|^2$ or $y_{k-1}^{T}g_k$, and the denominator is $\|g_{k-1}\|^2$, $y_{k-1}^{T}d_{k-1}$ or $-g_{k-1}^{T}d_{k-1}$. The six possible combinations are summarized in the following table.
| Numerator $\backslash$ Denominator | $\|g_{k-1}\|^2$ | $y_{k-1}^{T}d_{k-1}$ | $-g_{k-1}^{T}d_{k-1}$ |
|---|---|---|---|
| $\|g_k\|^2$ | FR | DY | CD |
| $y_{k-1}^{T}g_k$ | PRP | HS | LS |
Define the following quantities:
\[ n_1:=\|g_k\|^2,\quad n_2:=y_{k-1}^{T}g_k,\quad d_1:=\|g_{k-1}\|^2,\quad d_2:=y_{k-1}^{T}d_{k-1},\quad d_3:=-g_{k-1}^{T}d_{k-1}. \]
Then
\[ \beta_k^{FR}=\frac{n_1}{d_1},\quad \beta_k^{PRP}=\frac{n_2}{d_1},\quad \beta_k^{DY}=\frac{n_1}{d_2},\quad \beta_k^{HS}=\frac{n_2}{d_2},\quad \beta_k^{CD}=\frac{n_1}{d_3},\quad \beta_k^{LS}=\frac{n_2}{d_3}. \]
But there exist exceptions to these rules. One example is given in [38]:
\[ \beta_k^{D}=\frac{g_k^{T}G_{k-1}d_{k-1}}{d_{k-1}^{T}G_{k-1}d_{k-1}} \quad (1967). \]
From the presented chronological development of the CGUP, we can see that the parameter $\beta_k$ has been built from a few characteristic scalar products of successive gradients and directions.
For a large-scale unconstrained nonlinear optimization problem, in practice, choices for updating a CG parameter that do not require computation or approximation of the Hessian and its inverse are preferred over methods that require Hessian or its approximation in each iteration.
Wei et al. [124] gave a variant of the PRP method, which we call the VPRP method, with the parameter
\[ \beta_k^{VPRP}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}g_k^{T}g_{k-1}}{\|g_{k-1}\|^2}. \]
The VPRP method was extended to a variant of the HS method by Yao et al. in [132],
\[ \beta_k^{VHS}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}g_k^{T}g_{k-1}}{d_{k-1}^{T}y_{k-1}}. \]
Zhang [138] made a small modification to the VPRP method and constructed the NPRP method as follows:
\[ \beta_k^{NPRP}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\|g_{k-1}\|^2}. \]
Moreover, Zhang [138] extended this result to the HS method and proposed the NHS method as follows:
\[ \beta_k^{NHS}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{d_{k-1}^{T}y_{k-1}}. \]
Recently, Wei et al. [125] proposed a variation of the FR method, which we call the VFR method. The parameter is given by
\[ \beta_k^{VFR}=\frac{\mu_1\|g_k\|^2}{\mu_2|g_k^{T}d_{k-1}|+\mu_3\|g_{k-1}\|^2}, \]
where $\mu_1$, $\mu_2$, $\mu_3$ are positive constants. A related descent PRP variant (DPRP) is defined by
\[ \beta_k^{DPRP}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\mu|g_k^{T}d_{k-1}|+\|g_{k-1}\|^2}, \quad \mu>1. \]
Recently, Wei et al. [18] gave a variant of the PRP method, which we call the WYL method, that is,
\[ \beta_k^{WYL}=\frac{g_k^{T}\left(g_k-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}\right)}{\|g_{k-1}\|^2}. \]
The WYL method was extended to a variant of the LS method by Yao et al. [132], that is,
\[ \beta_k^{MLS}=\frac{g_k^{T}\left(g_k-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}\right)}{-d_{k-1}^{T}g_{k-1}}. \]
Also, the following quantities will be useful:
\[ N_1=g_k^{T}\left(g_k-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}\right)=n_1-\frac{\|g_k\|}{\|g_{k-1}\|}g_k^{T}g_{k-1}=g_k^{T}\hat y_{k-1}, \qquad N_2=n_1-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|, \]
where $\hat y_{k-1}=g_k-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}$. Then
\[ \beta_k^{WYL}=\beta_k^{VPRP}=\frac{N_1}{d_1},\quad \beta_k^{VHS}=\frac{N_1}{d_2},\quad \beta_k^{NPRP}=\frac{N_2}{d_1},\quad \beta_k^{NHS}=\frac{N_2}{d_2}. \]
Some particular CG variants are
\[ \beta_k^{DHS}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\mu|g_k^{T}d_{k-1}|+d_{k-1}^{T}y_{k-1}} \ (2012), \qquad \beta_k^{DLS}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\mu|g_k^{T}d_{k-1}|-d_{k-1}^{T}g_{k-1}}, \ \mu>1 \ (2017). \tag{4.1} \]
If the functions
\[ D_1(\mu)=\mu|g_k^{T}d_{k-1}|+d_{k-1}^{T}y_{k-1}, \qquad D_2(\mu)=\mu|g_k^{T}d_{k-1}|-d_{k-1}^{T}g_{k-1} \]
are defined, then
\[ \beta_k^{DHS}=\frac{N_2}{D_1(\mu)}, \qquad \beta_k^{DLS}=\frac{N_2}{D_2(\mu)}. \]
An extension of the conjugacy condition
\[ d_k^{T}y_{k-1}=0 \tag{4.2} \]
was studied by Perry [93], who tried to incorporate second-order information about the objective function into the CG method in order to accelerate it. Specifically, by using the secant condition and the search direction of the QN methods, which are respectively defined by
\[ B_k s_{k-1}=y_{k-1} \quad \text{and} \quad B_k d_k=-g_k, \tag{4.3} \]
the following relation is obtained:
\[ d_k^{T}y_{k-1}=d_k^{T}(B_k s_{k-1})=(B_k d_k)^{T}s_{k-1}=-g_k^{T}s_{k-1}, \tag{4.4} \]
where the symmetry of $B_k$ is used. This leads to Perry's conjugacy condition
\[ d_k^{T}y_{k-1}=-g_k^{T}s_{k-1}. \tag{4.5} \]
Furthermore, Dai and Liao [26] included a nonnegative parameter $t\ge 0$ and proposed the condition
\[ d_k^{T}y_{k-1}=-t\,g_k^{T}s_{k-1}. \tag{4.6} \]
In order to find the search direction satisfying (4.6), Dai and Liao defined the CGUP
\[ \beta_k^{DL}=\frac{g_k^{T}y_{k-1}}{d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\frac{g_k^{T}(y_{k-1}-t\,s_{k-1})}{d_{k-1}^{T}y_{k-1}}, \quad t>0. \tag{4.7} \]
Expression (4.7) for defining $\beta_k$ initiated a whole class of CG methods of Dai–Liao (DL) type, some of which are listed below:
\[ \beta_k^{DL}=\frac{y_{k-1}^{T}g_k}{y_{k-1}^{T}d_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\beta_k^{HS}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}} \ (2001), \tag{4.8} \]
\[ \beta_k^{DHSDL}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\mu|g_k^{T}d_{k-1}|+d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\beta_k^{DHS}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}} \ (2017), \tag{4.9} \]
\[ \beta_k^{DLSDL}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\mu|g_k^{T}d_{k-1}|-d_{k-1}^{T}g_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\beta_k^{DLS}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}} \ (2017), \tag{4.10} \]
\[ \beta_k^{MHSDL}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}g_k^{T}g_{k-1}}{d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\frac{g_k^{T}\hat y_{k-1}}{d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}} \ (2013), \tag{4.11} \]
where $\hat y_{k-1}=g_k-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}$.
Clearly, the choice $t=0$ in (4.7) reduces $\beta_k^{DL}$ to $\beta_k^{HS}$.
Some additional CG methods from the DL class are
\[ \beta_k^{MLSDL}=\frac{g_k^{T}\hat y_{k-1}}{-d_{k-1}^{T}g_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}, \tag{4.12} \]
\[ \beta_k^{ZZDL}=\frac{g_k^{T}z_{k-1}}{z_{k-1}^{T}d_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}z_{k-1}}, \tag{4.13} \]
where $z_{k-1}$ denotes an appropriately modified variant of the gradient difference $y_{k-1}$.
In order to characterize this family of CG methods, define the mapping
\[ F(\beta_k^{M},t)=\beta_k^{M}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}. \]
Then
\[ \beta_k^{DL}=F(\beta_k^{HS},t),\quad \beta_k^{DHSDL}=F(\beta_k^{DHS},t),\quad \beta_k^{DLSDL}=F(\beta_k^{DLS},t),\quad \beta_k^{MHSDL}=F(\beta_k^{MHS},t). \]
Research on the best value of the parameter $t$ in (4.7) is ongoing. One popular choice, which corresponds to the Hager–Zhang (HZ) method, is
\[ t_k=2\,\frac{\|y_{k-1}\|^2}{y_{k-1}^{T}s_{k-1}}. \]
Babaie-Kafaki and Ghanbari [9] presented two appropriate choices of the parameter $t$:
\[ t_k=\frac{s_{k-1}^{T}y_{k-1}}{\|s_{k-1}\|^2}+\frac{\|y_{k-1}\|}{\|s_{k-1}\|} \]
and
\[ t_k=\frac{\|y_{k-1}\|}{\|s_{k-1}\|}. \]
Andrei in [5] suggested the following value for $t$ in (4.7), which gives a variant of the DL method denoted by DLE:
\[ t_k=\frac{s_{k-1}^{T}y_{k-1}}{\|s_{k-1}\|^2}. \tag{4.14} \]
Lotfi and Hosseini in [81] proposed the most recent approximations of the parameter $t$.
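As a small illustration, the DL parameter (4.7) with the DLE choice (4.14) of $t$ can be computed as follows; the function interface is an assumption for the sketch.

```matlab
% Dai-Liao parameter (4.7) with t chosen by the DLE rule (4.14).
% g: current gradient g_k; d, s, y: previous direction d_{k-1} and
% the differences s_{k-1}, y_{k-1}.
function beta = beta_dl(g, d, s, y)
    t = (s'*y)/(s'*s);                   % DLE value of t, formula (4.14)
    beta = (g'*(y - t*s))/(d'*y);        % formula (4.7)
end
```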
In the subsequent sections, we survey recent advances in CG methods. Two main research streams can be observed: algorithms which improve the scalar parameter $\beta_k$ (for instance, through the choice of $t$ in the DL class), and algorithms which modify the search direction $d_k$.
Hybrid CG methods can be segmented into two classes: mixed methods, and methods combined together by introducing one or more parameters.
The following hybrid CG method was suggested in [121]:
\[ \beta_k=\begin{cases}\beta_k^{PRP}, & 0\le\beta_k^{PRP}\le\beta_k^{FR},\\ \beta_k^{FR}, & \text{otherwise.}\end{cases} \tag{4.15} \]
When a jam in the iterations occurs, the PRP update parameter is used. Hu and Storey in [61] had a similar motivation and suggested the rule
\[ \beta_k=\max\{0,\min\{\beta_k^{PRP},\beta_k^{FR}\}\}. \tag{4.16} \]
In [49] it is pointed out that $\beta_k$ may be allowed to take negative values, which leads to
\[ \beta_k=\max\{-\beta_k^{FR},\min\{\beta_k^{PRP},\beta_k^{FR}\}\}. \tag{4.17} \]
Dai and Yuan [31] combined the DY method with other CG methods, which leads to the following CGUP parameters:
\[ \beta_k=\max\{0,\min\{\beta_k^{HS},\beta_k^{DY}\}\} \tag{4.18} \]
and
\[ \beta_k=\max\left\{-c\,\beta_k^{DY},\min\{\beta_k^{HS},\beta_k^{DY}\}\right\}, \quad c=\frac{1-\sigma}{1+\sigma}. \tag{4.19} \]
In [28], the authors tested different CG methods for large-scale unconstrained optimization problems and concluded that the hybrid CG method (4.18) gave the best results.
The next hybrid CG method, proposed by Dai in [23], employs either the DY scheme or the CD scheme:
\[ \beta_k=\frac{\|g_k\|^2}{\max\{d_{k-1}^{T}y_{k-1},\,-g_{k-1}^{T}d_{k-1}\}}. \tag{4.20} \]
A modified CG method defined as the hybridization of the known LS and CD conjugate gradient methods was presented and analyzed in [130] by the rule
\[ \beta_k^{LSCD}=\max\{0,\min\{\beta_k^{LS},\beta_k^{CD}\}\}. \tag{4.21} \]
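The max/min hybridizations above reduce to one-line expressions once the basic parameters of the earlier cross table are available, as the following MATLAB sketch shows; the sample vectors are illustrative assumptions.

```matlab
% Hybrid CGUPs (4.16), (4.18) and (4.21) built from the six basic
% parameters; gk, gkm1 are successive gradients, dkm1 is the previous
% direction (sample data assumed for the demo).
gk = [1; -2; 0.5];  gkm1 = [2; -1; 1];  dkm1 = -gkm1;
y  = gk - gkm1;
n1 = norm(gk)^2;   n2 = y'*gk;                      % numerators
d1 = norm(gkm1)^2; d2 = y'*dkm1; d3 = -gkm1'*dkm1;  % denominators
betaFR  = n1/d1;  betaPRP = n2/d1;  betaDY = n1/d2;
betaHS  = n2/d2;  betaCD  = n1/d3;  betaLS = n2/d3;
betaHuS  = max(0, min(betaPRP, betaFR));        % rule (4.16)
betaHSDY = max(0, min(betaHS,  betaDY));        % rule (4.18)
betaLSCD = max(0, min(betaLS,  betaCD));        % rule (4.21)
```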
CG methods can also be combined together by introducing one or more parameters. In [32,33], Dai and Yuan proposed a one-parameter family of CG methods with
\[ \beta_k=\frac{\|g_k\|^2}{\theta_k\|g_{k-1}\|^2+(1-\theta_k)d_{k-1}^{T}y_{k-1}}, \tag{4.22} \]
where $\theta_k\in[0,1]$ is a parameter.
Another hybrid method, proposed by Delladji, Belloufi and Sellami in [39], exploits the PRP and HZ schemes, as
\[ \beta_k^{hPRPHZ}=\theta_k\beta_k^{PRP}+(1-\theta_k)\beta_k^{HZ}=\theta_k\,\frac{y_{k-1}^{T}g_k}{\|g_{k-1}\|^2}+(1-\theta_k)\,\frac{1}{d_{k-1}^{T}y_{k-1}}\left(y_{k-1}-2d_{k-1}\frac{\|y_{k-1}\|^2}{d_{k-1}^{T}y_{k-1}}\right)^{T}g_k, \tag{4.23} \]
in which $\theta_k\in[0,1]$ is the hybridization parameter.
Nazareth in [89] proposed a two-parameter family of CGUP parameters using convex combinations of numerators and denominators, as
\[ \beta_k=\frac{\nu_k\|g_k\|^2+(1-\nu_k)g_k^{T}y_{k-1}}{\theta_k\|g_{k-1}\|^2+(1-\theta_k)d_{k-1}^{T}y_{k-1}}, \tag{4.24} \]
where $\nu_k,\theta_k\in[0,1]$.
In [113], the authors proposed hybrid CG methods in which the search direction $d_k$ is modified using one of the rules
\[ d_k:=D(\beta_k,g_k,d_{k-1})=-\left(1+\beta_k\frac{g_k^{T}d_{k-1}}{\|g_k\|^2}\right)g_k+\beta_k d_{k-1}, \tag{4.25} \]
\[ d_k:=D_1(\beta_k,g_k,d_{k-1})=-B_k g_k+D(\beta_k,g_k,d_{k-1}), \tag{4.26} \]
and
\[ \beta_k^{LSCD}=\max\{0,\min\{\beta_k^{LS},\beta_k^{CD}\}\}, \qquad d_k=d(\beta_k^{LSCD},g_k,d_{k-1}). \tag{4.27} \]
The resulting method is known as the MLSCD method with the search direction
\[ d_k:=D(\beta_k^{LSCD},g_k,d_{k-1}). \tag{4.28} \]
In general, the idea is based on the replacement of the negative gradient $-g_k$ in (1.10) by a modified direction that incorporates quasi-Newton information.
Now we give the general framework of the CG class of methods.
Algorithm 5. General framework of CG methods.
Require: A starting point $x_0$ and a tolerance $\epsilon>0$.
1: Set $k=0$ and $d_0=-g_0$.
2: If $\|g_k\|\le\epsilon$, STOP; else perform Step 3.
3: Determine a step-size $\alpha_k$ by a line search.
4: Compute $x_{k+1}=x_k+\alpha_k d_k$.
5: Calculate the CGUP $\beta_{k+1}$.
6: Compute the direction $d_{k+1}=-g_{k+1}+\beta_{k+1}d_k$.
7: Set $k:=k+1$ and go to Step 2.
The first iteration in CG methods is a gradient step. Also, it is common to restart the algorithm periodically by taking the gradient step.
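A compact MATLAB realization of Algorithm 5 is sketched below, using a truncated PRP parameter, simple Armijo backtracking, and the restarts just described; the test problem and all constants are illustrative assumptions.

```matlab
% Minimal sketch of the CG framework (Algorithm 5) with a truncated
% PRP parameter max{beta_PRP, 0}, backtracking (2.4) and restarts.
A = diag(1:50);  b = ones(50,1);             % example quadratic objective
f = @(x) 0.5*x'*A*x - b'*x;  grad = @(x) A*x - b;

x = zeros(50,1);  g = grad(x);  d = -g;      % the first direction is a gradient step
for k = 1:1000
    if norm(g) <= 1e-6, break; end
    alpha = 1;                               % Armijo backtracking
    while f(x + alpha*d) > f(x) + 1e-4*alpha*(g'*d), alpha = 0.5*alpha; end
    x = x + alpha*d;
    gnew = grad(x);
    beta = max((gnew'*(gnew - g))/norm(g)^2, 0);   % truncated PRP update
    d = -gnew + beta*d;                      % direction (1.10)
    if mod(k,50) == 0 || gnew'*d >= 0        % periodic restart or loss of descent
        d = -gnew;                           % restart with the gradient step
    end
    g = gnew;
end
```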
It is a known fact that CG iterations are better than QN methods in terms of CPU time. Moreover, BFGS updates require greater memory space than CG methods. On the other hand, QN methods require a smaller number of iterations as well as fewer function evaluations. For this reason, one of the modern trends in defining new CG methods is the usage of the BFGS update in new rules for the search direction, such as
\[ d_k:=\begin{cases}-B_k g_k, & k=0,\\ D_1(\beta_k^{LSCD},g_k,d_{k-1}), & k\ge 1.\end{cases} \]
Below we give some assumptions related to line search procedures.
Assumption 5.1. The level set determined by the starting point $x_0$ is bounded, and in some neighborhood $\mathcal{N}$ of this set the gradient $g$ is Lipschitz continuous, i.e., there exists a constant $L>0$ such that
\[ \|g(u)-g(v)\|\le L\|u-v\|, \quad \forall u,v\in\mathcal{N}. \tag{5.1} \]
Assumption 5.1 initiates the existence of positive constants $D$ and $\gamma$ such that
\[ \|u-v\|\le D, \quad \forall u,v\in\mathcal{N} \tag{5.2} \]
and
\[ \|g(u)\|\le\gamma, \quad \forall u\in\mathcal{N}. \tag{5.3} \]
The conclusion of Lemma 5.1 is known as the Zoutendijk condition; its proof is given in [142].
Lemma 5.1. [17,142] Let Assumption 5.1 be satisfied and let the points $x_k$ be generated by the iterations (1.2), where $d_k$ is a descent direction and $\alpha_k$ fulfills the Wolfe conditions. Then
\[ \sum_{k=0}^{\infty}\frac{\|g_k\|^4}{\|d_k\|^2}<+\infty. \tag{5.4} \]
In this subsection, we recall properties of the HS, PRP and LS methods. Looking at the chronological development presented in Table 2, a clear observation is that the HS, PRP and LS methods involve the expression $y_{k-1}^{T}g_k$ in the numerator of the CGUP $\beta_k$.
Property (*) [49]. Consider a method defined by (1.2) and (1.10), and suppose that
\[ 0<\gamma\le\|g_k\|\le\bar\gamma \tag{5.5} \]
for all $k$. The method possesses Property (*) if there exist constants $b>1$ and $\lambda>0$ such that
\[ |\beta_k|\le b \tag{5.6} \]
and
\[ \|s_{k-1}\|\le\lambda \;\Longrightarrow\; |\beta_k|\le\frac{1}{2b}. \tag{5.7} \]
In order to prove that conjugate gradient methods have Property (*), it suffices to show that there exists a constant $c>0$ such that
\[ |\beta_k|\le c\|s_{k-1}\| \quad \text{for all } k, \tag{5.8} \]
under the assumption (5.5). Then, by putting $b=\max\{2,cD\}$ and $\lambda=\frac{1}{2bc}$, one obtains
\[ \|s_{k-1}\|\le\lambda \;\Longrightarrow\; |\beta_k|\le\frac{1}{2b}, \tag{5.9} \]
which confirms Property (*). It is easily shown that (5.8) holds for the HS, PRP and LS methods, and thus these methods have Property (*).
Next, we give the global convergence theorem of CG methods satisfying Property (*). The proof of Theorem 5.2 is given in [49].
Theorem 5.2. Consider any conjugate gradient method of the form (1.2), (1.10) that satisfies the following conditions: $\beta_k\ge 0$, the search directions satisfy the sufficient descent condition (1.6), and the method possesses Property (*).
If the Lipschitz and boundedness assumptions hold, then the iterates are globally convergent.
In this section, we recall properties of the FR, CD and DY methods. Looking at the chronological development presented in Table 2, it is observable that the FR, CD and DY methods involve the value $\|g_k\|^2$ in the numerator of the CGUP $\beta_k$.
Proposition 1. The following statements hold for the FR, DY and CD methods, respectively, under line searches with parameters $\sigma_1$ and $\sigma_2$ from (2.9):
\[ -\frac{1}{1-\sigma_1}\le\frac{g_k^{T}d_k}{\|g_k\|^2}\le\frac{-1+\sigma_2}{1-\sigma_1}. \tag{5.10} \]
\[ -\frac{1}{1-\sigma_1}\le\frac{g_k^{T}d_k}{\|g_k\|^2}\le-\frac{1}{1+\sigma_2}. \tag{5.11} \]
\[ -1-\sigma_1\le\frac{g_k^{T}d_k}{\|g_k\|^2}\le-1+\sigma_2. \tag{5.12} \]
Proposition 1 implies that the FR, CD and DY methods satisfy the sufficient descent condition (1.6), depending on the line search employed.
We now give the global convergence properties of the FR and DY methods, which were proven in [2] and [30], respectively.
Theorem 5.3. Suppose that Assumption 5.1 holds. Let the sequence $\{x_k\}$ be generated by the FR or DY method, where the step-size satisfies the corresponding (strong) Wolfe conditions. Then
\[ \liminf_{k\to\infty}\|g_k\|=0. \tag{5.13} \]
Methods surveyed in Subsection 5.2 exhibit strong convergence properties, but they may be inefficient in practice due to the appearance of jamming. On the other hand, although the methods given in Subsection 5.1 may not be convergent in general, they often perform better in practice. Details about this fact are given in [65]. This clearly explains why combinations of these methods have been proposed, with the aim of exploiting the attractive features of each family of methods.
The following assumptions will be commonly used in the subsequent convergence analysis of DHSDL, DLSDL, MHSDL and MLSDL methods.
It is supposed that the conditions in Assumption 5.1 hold, so that positive constants $D$ and $\gamma$ satisfying (5.2) and (5.3) exist.
By the uniform convexity of $f$, there exists a constant $\theta>0$ such that
\[ (g(u)-g(v))^{T}(u-v)\ge\theta\|u-v\|^2, \quad \text{for all } u,v\in S, \tag{5.14} \]
or equivalently,
\[ f(u)\ge f(v)+g(v)^{T}(u-v)+\frac{\theta}{2}\|u-v\|^2, \quad \text{for all } u,v\in S. \tag{5.15} \]
It follows from (5.14) and (5.15) that
\[ s_{k-1}^{T}y_{k-1}\ge\theta\|s_{k-1}\|^2 \tag{5.16} \]
and
\[ f_{k-1}-f_k\ge -g_k^{T}s_{k-1}+\frac{\theta}{2}\|s_{k-1}\|^2. \tag{5.17} \]
By (5.1) and (5.16), we have
\[ \theta\|s_{k-1}\|^2\le s_{k-1}^{T}y_{k-1}\le L\|s_{k-1}\|^2, \tag{5.18} \]
where the inequalities imply $\theta\le L$.
Further, (5.18) implies
\[ s_{k-1}^{T}y_{k-1}=\alpha_{k-1}d_{k-1}^{T}y_{k-1}>0. \tag{5.19} \]
From $\alpha_{k-1}>0$, it follows that
\[ d_{k-1}^{T}y_{k-1}>0. \tag{5.20} \]
In order to improve the presentation, an arbitrary basic CGUP appearing in a method defined by (1.2) and (1.10) will be denoted by $\beta'_k$, so that the corresponding DL-type parameter is of the form
\[ \beta_k=\beta'_k-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}. \tag{5.21} \]
Lemma 5.4. [17,142] Let Assumption 5.1 be satisfied and let the points $x_k$ be generated by the iterations (1.2) and (1.10), where $d_k$ is a descent direction and $\alpha_k$ fulfills the Wolfe conditions. Then the Zoutendijk condition (5.4) holds.
Lemma 5.5. The parameters $\beta'_k\in\{\beta_k^{DHS},\beta_k^{DLS},\beta_k^{MHS},\beta_k^{MLS}\}$ satisfy
\[ 0\le\beta'_k\le\frac{\|g_k\|^2}{\lambda|g_k^{T}d_{k-1}|}, \quad \lambda\ge 1, \tag{5.22} \]
in each iterative step $k$.
Proof. In the cases $\beta'_k=\beta_k^{DHS}$ and $\beta'_k=\beta_k^{DLS}$, the estimates
\[ 0\le\beta_k^{DHS},\ \beta_k^{DLS}\le\frac{\|g_k\|^2}{\mu|g_k^{T}d_{k-1}|} \]
are known from [29].
For $\beta'_k=\beta_k^{MHS}$, it holds that
\[ 0\le\beta_k^{MHS}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{d_{k-1}^{T}y_{k-1}}\le\frac{\|g_k\|^2}{|g_k^{T}d_{k-1}|}. \]
For $\beta'_k=\beta_k^{MLS}$, one similarly obtains
\[ 0\le\beta_k^{MLS}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{-d_{k-1}^{T}g_{k-1}}\le\frac{\|g_k\|^2}{|g_k^{T}d_{k-1}|}. \]
Since $\mu>1$, the inequality (5.22) holds in all four cases. $\square$
Lemma 5.6. The iterations (1.2) and (1.10), in which $\beta_k$ is defined by (5.21) with $\beta'_k$ satisfying (5.22), generate search directions fulfilling
\[ g_k^{T}d_k\le-c\|g_k\|^2 \tag{5.23} \]
for some $c>0$ and all $k\ge 0$.
Proof. The inequality (5.23) will be verified by induction. In the initial situation, $g_0^{T}d_0=-\|g_0\|^2$, so (5.23) holds for $k=0$. For $k\ge 1$, (1.10) implies
\[ g_k^{T}d_k=-\|g_k\|^2+\beta_k g_k^{T}d_{k-1}. \tag{5.24} \]
An application of (5.21) and $s_{k-1}=\alpha_{k-1}d_{k-1}$ gives
\[ g_k^{T}d_k=-\|g_k\|^2+\left(\beta'_k-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}\right)g_k^{T}d_{k-1}=-\|g_k\|^2+\beta'_k g_k^{T}d_{k-1}-t\,\alpha_{k-1}\frac{(g_k^{T}d_{k-1})^2}{d_{k-1}^{T}y_{k-1}}. \]
From (5.20), it follows that
\[ t\,\alpha_{k-1}\frac{(g_k^{T}d_{k-1})^2}{d_{k-1}^{T}y_{k-1}}\ge 0. \tag{5.25} \]
Now (5.25) in conjunction with (5.22) implies
\[ g_k^{T}d_k\le-\|g_k\|^2+\beta'_k g_k^{T}d_{k-1}\le-\|g_k\|^2+\frac{\|g_k\|^2}{\lambda|g_k^{T}d_{k-1}|}\,|g_k^{T}d_{k-1}|\le-\|g_k\|^2+\frac{\|g_k\|^2}{\lambda}=-\left(1-\frac{1}{\lambda}\right)\|g_k\|^2. \]
In view of $\lambda>1$, the inequality (5.23) holds with $c=1-\frac{1}{\lambda}$. $\square$
Lemma 5.7. The parameters $\beta_k\in\{\beta_k^{DHSDL},\beta_k^{DLSDL},\beta_k^{MHSDL}\}$ satisfy
\[ \beta_k\le\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|-t\,g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}. \tag{5.26} \]
Proof. For $\beta_k=\beta_k^{DHSDL}$, it holds that
\[ \beta_k^{DHSDL}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\mu|g_k^{T}d_{k-1}|+d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}\le\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|-t\,g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}. \tag{5.27} \]
Further, it holds that
\[ d_{k-1}^{T}y_{k-1}=d_{k-1}^{T}(g_k-g_{k-1})=d_{k-1}^{T}g_k-d_{k-1}^{T}g_{k-1}\le|d_{k-1}^{T}g_k|-d_{k-1}^{T}g_{k-1}\le\mu|d_{k-1}^{T}g_k|-d_{k-1}^{T}g_{k-1}. \tag{5.28} \]
Using the inequalities (5.20) and (5.28) in $\beta_k^{DLSDL}$, one obtains
\[ \beta_k^{DLSDL}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{\mu|g_k^{T}d_{k-1}|-d_{k-1}^{T}g_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}\le\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|-t\,g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}. \tag{5.29} \]
The following conclusion is valid in the case $\beta_k=\beta_k^{MHSDL}$:
\[ \beta_k^{MHSDL}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{d_{k-1}^{T}y_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}=\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|-t\,g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}. \tag{5.30} \]
Therefore, the inequality (5.26) follows from (5.27), (5.29) and (5.30). $\square$
The global convergence of the proposed methods is confirmed by Theorem 5.8.
Theorem 5.8. Assume that Assumption 5.1 is satisfied and $f$ is uniformly convex. Then the iterations (1.2) and (1.10), in which $\beta_k\in\{\beta_k^{DHSDL},\beta_k^{DLSDL},\beta_k^{MHSDL},\beta_k^{MLSDL}\}$, satisfy
\[ \liminf_{k\to\infty}\|g_k\|=0. \tag{5.31} \]
Proof. Assume that (5.31) is not true. This implies the existence of a constant $c_1>0$ such that
\[ \|g_k\|\ge c_1, \quad \text{for all } k. \tag{5.32} \]
Squaring both sides of (1.10) implies
\[ \|d_k\|^2=\|g_k\|^2-2\beta_k g_k^{T}d_{k-1}+\beta_k^2\|d_{k-1}\|^2. \tag{5.33} \]
Taking into account (5.21), one can obtain
\[ -2\beta_k g_k^{T}d_{k-1}=-2\left(\beta'_k-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}\right)g_k^{T}d_{k-1}=-2\left(\beta'_k g_k^{T}d_{k-1}-t\,\alpha_{k-1}\frac{(g_k^{T}d_{k-1})^2}{d_{k-1}^{T}y_{k-1}}\right). \tag{5.34} \]
Now, from (5.25) and (5.22), it follows that
\[ -2\beta_k g_k^{T}d_{k-1}\le 2|\beta'_k||g_k^{T}d_{k-1}|\le 2\,\frac{\|g_k\|^2}{\lambda|g_k^{T}d_{k-1}|}\,|g_k^{T}d_{k-1}|\le\frac{2\|g_k\|^2}{\lambda}. \tag{5.35} \]
Case 1. In the cases $\beta_k\in\{\beta_k^{DHSDL},\beta_k^{DLSDL},\beta_k^{MHSDL}\}$, Lemma 5.7, (5.16) (which gives $d_{k-1}^{T}y_{k-1}\ge\theta\alpha_{k-1}\|d_{k-1}\|^2$) and the triangle inequality imply
\[ \begin{aligned} \beta_k&\le\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|-t\,g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}\le\frac{\left|g_k^{T}\left(g_k-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}-t\,s_{k-1}\right)\right|}{\theta\alpha_{k-1}\|d_{k-1}\|^2}\\ &=\frac{\left|g_k^{T}\left(g_k-g_{k-1}+g_{k-1}-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}-t\,s_{k-1}\right)\right|}{\theta\alpha_{k-1}\|d_{k-1}\|^2}\le\frac{\|g_k\|\left(\|g_k-g_{k-1}\|+\left|1-\frac{\|g_k\|}{\|g_{k-1}\|}\right|\|g_{k-1}\|+t\|s_{k-1}\|\right)}{\theta\alpha_{k-1}\|d_{k-1}\|^2}\\ &\le\frac{\|g_k\|\left(\|g_k-g_{k-1}\|+\bigl|\|g_{k-1}\|-\|g_k\|\bigr|+t\|s_{k-1}\|\right)}{\theta\alpha_{k-1}\|d_{k-1}\|^2}\le\frac{\|g_k\|\left(\|g_k-g_{k-1}\|+\|g_{k-1}-g_k\|+t\|s_{k-1}\|\right)}{\theta\alpha_{k-1}\|d_{k-1}\|^2}. \end{aligned} \tag{5.36} \]
So,
\[ \beta_k\le\frac{\|g_k\|\left(2\|g_k-g_{k-1}\|+t\|s_{k-1}\|\right)}{\theta\alpha_{k-1}\|d_{k-1}\|^2}\le\frac{\|g_k\|\left(2L\|s_{k-1}\|+t\|s_{k-1}\|\right)}{\theta\alpha_{k-1}\|d_{k-1}\|^2}\le\frac{(2L+t)\|g_k\|\|s_{k-1}\|}{\theta\alpha_{k-1}\|d_{k-1}\|^2}=\frac{(2L+t)\|g_k\|\,\alpha_{k-1}\|d_{k-1}\|}{\theta\alpha_{k-1}\|d_{k-1}\|^2}=\frac{(2L+t)\|g_k\|}{\theta\|d_{k-1}\|}. \tag{5.37} \]
Using (5.35) and (5.37) in (5.33), we obtain
\[ \|d_k\|^2\le\|g_k\|^2+\frac{2\|g_k\|^2}{\lambda}+\frac{(2L+t)^2\|g_k\|^2}{\theta^2\|d_{k-1}\|^2}\,\|d_{k-1}\|^2=\left(1+\frac{2}{\lambda}+\frac{(2L+t)^2}{\theta^2}\right)\|g_k\|^2=\frac{(\lambda+2)\theta^2+\lambda(2L+t)^2}{\lambda\theta^2}\,\|g_k\|^2. \tag{5.38} \]
Next, dividing both sides of (5.38) by $\|g_k\|^4$ and using (5.32), we obtain
\[ \frac{\|d_k\|^2}{\|g_k\|^4}\le\frac{(\lambda+2)\theta^2+\lambda(2L+t)^2}{\lambda\theta^2}\cdot\frac{1}{c_1^2} \;\Longleftrightarrow\; \frac{\|g_k\|^4}{\|d_k\|^2}\ge\frac{\lambda\theta^2 c_1^2}{(\lambda+2)\theta^2+\lambda(2L+t)^2}. \tag{5.39} \]
The inequalities in (5.39) imply
\[ \sum_{k=0}^{\infty}\frac{\|g_k\|^4}{\|d_k\|^2}\ge\sum_{k=0}^{\infty}\frac{\lambda\theta^2 c_1^2}{(\lambda+2)\theta^2+\lambda(2L+t)^2}=\infty. \tag{5.40} \]
Therefore, the Zoutendijk condition (5.4) is violated, which contradicts Lemma 5.4.
Case 2. In the case $\beta_k=\beta_k^{MLSDL}$, it follows that
\[ \begin{aligned} \beta_k^{MLSDL}&\le\left|\frac{\|g_k\|^2-\frac{\|g_k\|}{\|g_{k-1}\|}|g_k^{T}g_{k-1}|}{-d_{k-1}^{T}g_{k-1}}-t\,\frac{g_k^{T}s_{k-1}}{d_{k-1}^{T}y_{k-1}}\right|\le\frac{\left|g_k^{T}\left(g_k-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}\right)\right|}{\left|-d_{k-1}^{T}g_{k-1}\right|}+t\,\frac{|g_k^{T}s_{k-1}|}{|d_{k-1}^{T}y_{k-1}|}\\ &=\frac{\left|g_k^{T}\left(g_k-g_{k-1}+g_{k-1}-\frac{\|g_k\|}{\|g_{k-1}\|}g_{k-1}\right)\right|}{\left|-d_{k-1}^{T}g_{k-1}\right|}+t\,\frac{\|g_k\|\|s_{k-1}\|}{|d_{k-1}^{T}y_{k-1}|}\\ &\le\frac{\|g_k\|\left(\|g_k-g_{k-1}\|+\left|1-\frac{\|g_k\|}{\|g_{k-1}\|}\right|\|g_{k-1}\|\right)}{\left|-d_{k-1}^{T}g_{k-1}\right|}+t\,\frac{\|g_k\|\|s_{k-1}\|}{|d_{k-1}^{T}y_{k-1}|}\le\frac{\|g_k\|\left(\|g_k-g_{k-1}\|+\|g_k-g_{k-1}\|\right)}{\left|-d_{k-1}^{T}g_{k-1}\right|}+t\,\frac{\|g_k\|\|s_{k-1}\|}{|d_{k-1}^{T}y_{k-1}|}\\ &\le\frac{2\|g_k\|\|g_k-g_{k-1}\|}{c\|g_{k-1}\|^2}+t\,\frac{\|g_k\|\|s_{k-1}\|}{|d_{k-1}^{T}y_{k-1}|}\le\frac{2L\|g_k\|\|s_{k-1}\|}{c\|g_{k-1}\|^2}+t\,\frac{\|g_k\|\|s_{k-1}\|}{|d_{k-1}^{T}y_{k-1}|}, \end{aligned} \tag{5.41} \]
where the sufficient descent condition $-d_{k-1}^{T}g_{k-1}\ge c\|g_{k-1}\|^2$ is used. From (5.18), (5.19) and (5.41), we conclude
\[ \beta_k^{MLSDL}\le\frac{2L\|g_k\|\|s_{k-1}\|}{c\|g_{k-1}\|^2}+t\,\frac{\|g_k\|}{\theta\|s_{k-1}\|}\le\frac{2L\|g_k\|\|s_{k-1}\|}{c\,c_1^2}+t\,\frac{\|g_k\|}{\theta\|s_{k-1}\|}=\frac{\left(2L\theta\|s_{k-1}\|^2+c\,c_1^2\,t\right)\|g_k\|}{c\,c_1^2\,\theta\|s_{k-1}\|}\le\frac{\left(2L\theta D^2+c\,c_1^2\,t\right)\|g_k\|}{c\,c_1^2\,\theta\,\alpha_{k-1}\|d_{k-1}\|}. \tag{5.42} \]
Replacement of (5.35) and (5.42) in (5.33) leads to
\[ \|d_k\|^2\le\|g_k\|^2+\frac{2\|g_k\|^2}{\lambda}+\frac{\left(2L\theta D^2+c\,c_1^2\,t\right)^2\|g_k\|^2}{\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2\|d_{k-1}\|^2}\,\|d_{k-1}\|^2=\frac{(\lambda+2)\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2+\lambda\left(2L\theta D^2+c\,c_1^2\,t\right)^2}{\lambda\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2}\,\|g_k\|^2. \tag{5.43} \]
Next, dividing both sides of (5.43) by $\|g_k\|^4$ and using (5.32), we obtain
\[ \frac{\|d_k\|^2}{\|g_k\|^4}\le\frac{(\lambda+2)\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2+\lambda\left(2L\theta D^2+c\,c_1^2\,t\right)^2}{\lambda\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2}\cdot\frac{1}{c_1^2} \;\Longleftrightarrow\; \frac{\|g_k\|^4}{\|d_k\|^2}\ge\frac{\lambda\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2 c_1^2}{(\lambda+2)\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2+\lambda\left(2L\theta D^2+c\,c_1^2\,t\right)^2}. \tag{5.44} \]
The inequalities in (5.44) imply
\[ \sum_{k=0}^{\infty}\frac{\|g_k\|^4}{\|d_k\|^2}\ge\sum_{k=0}^{\infty}\frac{\lambda\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2 c_1^2}{(\lambda+2)\left(c\,c_1^2\,\theta\,\alpha_{k-1}\right)^2+\lambda\left(2L\theta D^2+c\,c_1^2\,t\right)^2}=\infty. \tag{5.45} \]
Therefore, the Zoutendijk condition (5.4) is again violated, which contradicts Lemma 5.4. Hence, (5.31) holds. $\square$
The code used in the testing experiments is written in MATLAB R2017a and executed on a personal workstation with an Intel Core i3 processor.
The numerical experiments are performed on test functions presented in [3], many of which are taken from the CUTEr collection [14]. Each test function is tested for several values of the problem dimension.
The strong Wolfe line search uses the same choice of parameters for all algorithms.
We utilized the performance profile given in [43] to compare numerical results (IT, FE and CPU) for all tested methods. The upper curve of the selected performance profile corresponds to the method that shows the best performance.
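The performance profile used in [43] can be reproduced with a few lines of MATLAB; the sample cost matrix below is an illustrative assumption.

```matlab
% Sketch of the performance profile from [43]. T is an (np x ns)
% matrix of costs (IT, FE or CPU) for np problems and ns solvers;
% NaN marks a failure. The sample data are assumed for the demo.
T = [10 12 9; 5 NaN 7; 100 80 90; 3 3 4];
T(isnan(T)) = inf;                          % failed runs never attain the best ratio
r = T ./ min(T, [], 2);                     % performance ratios r_{p,s}
tau = linspace(1, 5, 200);
rho = zeros(numel(tau), size(T,2));
for j = 1:numel(tau)
    rho(j,:) = mean(r <= tau(j), 1);        % fraction of problems with r_{p,s} <= tau
end
plot(tau, rho);  xlabel('\tau');  ylabel('\rho_s(\tau)');
legend('solver 1', 'solver 2', 'solver 3');
```

The method whose curve reaches the top of the plot first solves the largest fraction of problems within any given factor of the best cost, which is exactly the reading used for the figures below.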
BFGS, DFP and SR1 updates in QN methods with respect to different ILS strategies were compared in [40]. The main conclusion is that the BFGS method is superior to the others. Continuing such research, we compare the numerical performances obtained from the AGD, MSM and SM methods, i.e., gradient methods with an acceleration parameter. The numerical experiment contains 25 test functions.
The uniform stopping criteria in these numerical experiments are
\[ \|g_k\|\le 10^{-6} \quad \text{and} \quad \frac{|f_{k+1}-f_k|}{1+|f_k|}\le 10^{-16}. \]
Summary numerical data generated by the AGD, MSM and SM methods, tried on 25 test functions, are arranged in Table 4.
| Test function | AGD (IT) | MSM (IT) | SM (IT) | AGD (FE) | MSM (FE) | SM (FE) | AGD (CPU) | MSM (CPU) | SM (CPU) |
Perturbed Quadratic | 353897 | 34828 | 59908 | 13916515 | 200106 | 337910 | 6756.047 | 116.281 | 185.641 |
Raydan 1 | 22620 | 26046 | 14918 | 431804 | 311260 | 81412 | 158.359 | 31.906 | 36.078 |
Diagonal 3 | 120416 | 7030 | 12827 | 4264718 | 38158 | 69906 | 5527.844 | 52.609 | 102.875 |
Generalized Tridiagonal 1 | 670 | 346 | 325 | 9334 | 1191 | 1094 | 11.344 | 1.469 | 1.203 |
Extended Tridiagonal 1 | 3564 | 1370 | 4206 | 14292 | 10989 | 35621 | 55.891 | 29.047 | 90.281 |
Extended TET | 443 | 156 | 156 | 3794 | 528 | 528 | 3.219 | 0.516 | 0.594 |
Diagonal 4 | 120 | 96 | 96 | 1332 | 636 | 636 | 0.781 | 0.203 | 0.141 |
Extended Himmelblau | 396 | 260 | 196 | 6897 | 976 | 668 | 1.953 | 0.297 | 0.188 |
Perturbed quadratic diagonal | 2542050 | 37454 | 44903 | 94921578 | 341299 | 460028 | 44978.750 | 139.625 | 185.266 |
Quadratic QF1 | 366183 | 36169 | 62927 | 13310016 | 208286 | 352975 | 12602.563 | 81.531 | 138.172 |
Extended quadratic penalty QP1 | 210 | 369 | 271 | 2613 | 2196 | 2326 | 1.266 | 1.000 | 0.797 |
Extended quadratic penalty QP2 | 395887 | 1674 | 3489 | 9852040 | 11491 | 25905 | 3558.734 | 3.516 | 6.547 |
Quadratic QF2 | 100286 | 32727 | 64076 | 3989239 | 183142 | 353935 | 1582.766 | 73.438 | 132.703 |
Extended quadratic exponential EP1 | 48 | 100 | 73 | 990 | 894 | 661 | 0.750 | 0.688 | 0.438 |
Extended Tridiagonal 2 | 1657 | 659 | 543 | 8166 | 2866 | 2728 | 3.719 | 1.047 | 1.031 |
ARWHEAD (CUTE) | 5667 | 430 | 270 | 214284 | 5322 | 3919 | 95.641 | 1.969 | 1.359 |
Almost Perturbed Quadratic | 356094 | 33652 | 60789 | 14003318 | 194876 | 338797 | 13337.125 | 73.047 | 133.516 |
LIARWHD (CUTE) | 1054019 | 3029 | 18691 | 47476667 | 27974 | 180457 | 27221.516 | 9.250 | 82.016 |
ENGVAL1 (CUTE) | 743 | 461 | 375 | 6882 | 2285 | 2702 | 3.906 | 1.047 | 1.188 |
QUARTC (CUTE) | 171 | 217 | 290 | 402 | 494 | 640 | 2.469 | 1.844 | 2.313 |
Generalized Quartic | 187 | 181 | 189 | 849 | 493 | 507 | 0.797 | 0.281 | 0.188 |
Diagonal 7 | 72 | 147 | 108 | 333 | 504 | 335 | 0.625 | 0.547 | 0.375 |
Diagonal 8 | 60 | 120 | 118 | 304 | 383 | 711 | 0.438 | 0.469 | 0.797 |
Full Hessian FH3 | 45 | 63 | 63 | 1352 | 566 | 631 | 1.438 | 0.391 | 0.391 |
Diagonal 9 | 329768 | 10540 | 13619 | 13144711 | 68189 | 89287 | 6353.172 | 43.609 | 38.672 |
Table 4 contains the numerical results corresponding to the IT, FE and CPU criteria for the AGD, MSM and SM methods. Figures 1, 2 and 3 illustrate the performance profiles corresponding to the results in Table 4 with respect to the criteria IT, FE and CPU, respectively.
From Table 4, we conclude that the AGD, MSM and SM methods have successfully solved all test functions.
Figure 1 presents the performance profiles of the IT for the AGD, MSM and SM methods. In this figure, it is observable that the MSM method is best on the largest portion of the test problems.
From Figure 1, it is observable that the graph of the MSM method comes first to the top, which means that the MSM is superior compared to the other considered methods with respect to the IT profile.
Figure 2 presents the performance profiles of the FE for the AGD, MSM and SM methods. It is observable that the MSM method again achieves the best FE performance on most of the test functions.
Figure 3 presents the performance profiles of the CPU time for the AGD, MSM and SM methods. It is obvious that MSM is the winner with respect to the CPU time profile.
From the previous analysis of the results shown in Table 4 and Figures 1-3, we can conclude that the MSM iterates are most efficient in terms of all three basic metrics: IT, FE and CPU. The MSM method has the smallest IT, FE and CPU time compared to the other two methods on most of the test functions.
The uniform stopping criterion during testing of the CG methods is
\[ \|g_k\|\le\epsilon, \]
where $\epsilon$ is a predefined small tolerance.
In this subsection, we compare the numerical results obtained from the HS, PRP and LS methods, i.e., the methods with $y_{k-1}^{T}g_k$ in the numerator of the CGUP $\beta_k$.
| Test function | HS (IT) | PRP (IT) | LS (IT) | HS (FE) | PRP (FE) | LS (FE) | HS (CPU) | PRP (CPU) | LS (CPU) |
Perturbed Quadratic | 1157 | 1157 | 6662 | 3481 | 3481 | 19996 | 0.234 | 0.719 | 1.438 |
Raydan 2 | NaN | 174 | 40 | NaN | 373 | 120 | NaN | 0.094 | 0.078 |
Diagonal 2 | NaN | 1721 | 5007 | NaN | 6594 | 15498 | NaN | 1.313 | 2.891 |
Extended Tridiagonal 1 | NaN | 170 | 17079 | NaN | 560 | 54812 | NaN | 0.422 | 13.641 |
Diagonal 4 | NaN | 70 | 1927 | NaN | 180 | 5739 | NaN | 0.078 | 0.391 |
Diagonal 5 | NaN | 154 | 30 | NaN | 338 | 90 | NaN | 0.172 | 0.078 |
Extended Himmelblau | 160 | 120 | 241 | 820 | 600 | 1043 | 0.172 | 0.125 | 0.172 |
Full Hessian FH2 | 5096 | 5686 | 348414 | 15294 | 17065 | 1045123 | 83.891 | 80.625 | 5081.875 |
Perturbed quadratic diagonal | 1472 | 1120 | 21667 | 4419 | 3363 | 65057 | 0.438 | 0.391 | 2.547 |
Quadratic QF1 | 1158 | 1158 | 5612 | 3484 | 3484 | 16813 | 0.281 | 0.313 | 1.047 |
Extended quadratic penalty QP2 | NaN | 533 | NaN | NaN | 5395 | NaN | NaN | 0.781 | NaN |
Quadratic QF2 | 2056 | 2311 | NaN | 9168 | 9862 | NaN | 0.969 | 0.859 | NaN |
Extended quadratic exponential EP1 | NaN | NaN | 70 | NaN | NaN | 350 | NaN | NaN | 0.141 |
TRIDIA (CUTE) | 6835 | 6744 | NaN | 20521 | 20248 | NaN | 1.438 | 1.094 | NaN |
Almost Perturbed Quadratic | 1158 | 1158 | 5996 | 3484 | 3484 | 17998 | 0.281 | 0.328 | 1.063 |
LIARWHD (CUTE) | NaN | 408 | 11498 | NaN | 4571 | 50814 | NaN | 0.438 | 2.969 |
POWER (CUTE) | 7781 | 7789 | 190882 | 23353 | 23377 | 572656 | 1.422 | 1.219 | 14.609 |
NONSCOMP (CUTE) | 4545 | 3647 | NaN | 15128 | 12433 | NaN | 0.875 | 0.656 | NaN |
QUARTC (CUTE) | NaN | 165 | 155 | NaN | 1347 | 1466 | NaN | 0.781 | 0.766 |
Diagonal 6 | NaN | 174 | 137 | NaN | 373 | 442 | NaN | 0.109 | 0.125 |
DIXON3DQ (CUTE) | NaN | 12595 | 12039 | NaN | 37714 | 36091 | NaN | 1.641 | 2.859 |
BIGGSB1 (CUTE) | NaN | 11454 | 11517 | NaN | 34293 | 34530 | NaN | 1.969 | 2.141 |
Generalized Quartic | NaN | 134 | 139 | NaN | 458 | 445 | NaN | 0.125 | 0.094 |
Diagonal 7 | NaN | 51 | 80 | NaN | 142 | 240 | NaN | 0.063 | 0.109 |
Diagonal 8 | NaN | 70 | 80 | NaN | 180 | 180 | NaN | 0.063 | 0.125 |
FLETCHCR (CUTE) | 18292 | 19084 | 20354 | 178305 | 170266 | 171992 | 8.859 | 6.203 | 7.484 |
Table 5 shows the numerical results (IT, FE and CPU) for the HS, PRP and LS methods.
Figures 4, 5 and 6 plot the performance profiles for the results in Table 5 with respect to IT, FE and CPU criterion, respectively.
Figure 4 presents the performance profiles of the IT corresponding to the HS, PRP and LS methods. In this figure, it is observable that the PRP method is best on the largest portion of the test problems.
Figure 5 presents the performance profiles of the FE of the HS, PRP and LS methods. It is observable that the PRP method again shows the best FE performance.
Figure 6 presents the performance profiles of the CPU time of the HS, PRP and LS methods. It is obvious that PRP is the winner with respect to the CPU time profile.
From the previous analysis of the results shown in Table 5 and Figures 4-6, we can conclude that the PRP method achieved the best and most efficient results in terms of all three basic metrics: IT, FE and CPU.
In this subsection, we compare the numerical results obtained from the DY, FR and CD methods, i.e., the methods with $\|g_k\|^2$ in the numerator of the CGUP $\beta_k$.
| Test function | DY (IT) | FR (IT) | CD (IT) | DY (FE) | FR (FE) | CD (FE) | DY (CPU) | FR (CPU) | CD (CPU) |
Perturbed Quadratic | 1157 | 1157 | 1157 | 3481 | 3481 | 3481 | 0.469 | 0.609 | 0.531 |
Raydan 2 | 86 | 40 | 40 | 192 | 100 | 100 | 0.063 | 0.016 | 0.016 |
Diagonal 2 | 1636 | 3440 | 2058 | 4774 | 7982 | 8063 | 0.922 | 1.563 | 1.297 |
Extended Tridiagonal 1 | 2081 | 690 | 1140 | 4639 | 2022 | 2984 | 1.703 | 1.141 | 1.578 |
Diagonal 4 | 70 | 70 | 70 | 200 | 200 | 200 | 0.047 | 0.031 | 0.016 |
Diagonal 5 | 40 | 124 | 155 | 100 | 258 | 320 | 0.109 | 0.141 | 0.125 |
Extended Himmelblau | 383 | 339 | 207 | 1669 | 1467 | 961 | 0.219 | 0.172 | 0.172 |
Full Hessian FH2 | 4682 | 4868 | 4794 | 14054 | 14610 | 14390 | 65.938 | 66.469 | 65.922 |
Perturbed quadratic diagonal | 1036 | 1084 | 1276 | 3114 | 3258 | 3834 | 0.406 | 0.422 | 0.422 |
Quadratic QF1 | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.297 | 0.297 | 0.328 |
Quadratic QF2 | NaN | NaN | 2349 | NaN | NaN | 10073 | NaN | NaN | 1.531 |
Extended quadratic exponential EP1 | NaN | 60 | 60 | NaN | 310 | 310 | NaN | 0.109 | 0.125 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 3484 | 3484 | 3484 | 0.422 | 0.453 | 0.391 |
LIARWHD (CUTE) | 2812 | 1202 | 1255 | 12366 | 7834 | 7379 | 0.938 | 1.000 | 1.109 |
POWER (CUTE) | 7779 | 7781 | 7782 | 23347 | 23353 | 23356 | 1.078 | 1.500 | 1.328 |
NONSCOMP (CUTE) | 2558 | 13483 | 10901 | 49960 | 43268 | 33413 | 1.203 | 1.406 | 1.422 |
QUARTC (CUTE) | 134 | 94 | 95 | 1132 | 901 | 916 | 0.688 | 0.672 | 0.563 |
Diagonal 6 | 86 | 40 | 40 | 192 | 100 | 100 | 0.047 | 0.063 | 0.063 |
DIXON3DQ (CUTE) | 16047 | 18776 | 19376 | 48172 | 56369 | 58176 | 2.266 | 2.516 | 2.734 |
BIGGSB1 (CUTE) | 15274 | 17835 | 18374 | 45853 | 53546 | 55170 | 2.875 | 2.922 | 2.484 |
Generalized Quartic | 142 | 214 | 173 | 497 | 712 | 589 | 0.078 | 0.172 | 0.109 |
Diagonal 7 | 50 | 50 | 50 | 160 | 160 | 160 | 0.063 | 0.047 | 0.094 |
Diagonal 8 | 50 | 40 | 40 | 160 | 130 | 130 | 0.109 | 0.125 | 0.063 |
Full Hessian FH3 | 43 | 43 | 43 | 139 | 139 | 139 | 0.063 | 0.109 | 0.109 |
FLETCHCR (CUTE) | NaN | NaN | 26793 | NaN | NaN | 240237 | NaN | NaN | 10.203 |
Table 6 contains numerical results (IT, FE and CPU) for the DY, FR and CD methods. Figures 7, 8 and 9 plot the performance profiles for the results in Table 6 with respect to profiles IT, FE and CPU, respectively.
From Figure 7, it is observable that the graph of the CD method comes first to the top, which signifies that the CD outperforms other considered methods with respect to the IT.
Figure 8 presents the performance profiles of the FE of the DY, FR and CD methods. From Figure 8, it is observed that the CD graph first comes to the highest level, which means that the CD possesses best performances with respect to the criterion FE.
Figure 9 presents the performance profiles of the CPU time of the DY, FR and CD methods. Figure 9 demonstrates that the graph of the CD method first achieves the top level, so that the CD is the winner with respect to the CPU time.
From the previous analysis of the results shown in Table 6 and Figures 7-9, it is clear that the CD method achieved most efficient results in terms of all three basic metrics: IT, FE and CPU.
This subsection analyzes numerical results obtained by running a MATLAB implementation under the predefined conditions given at the beginning of the section. The following ten hybrid CG methods of the form (1.2) and (1.10), which differ only in the choice of the CG parameter $\beta_k$, are compared:
- HCG1-HCG10: ten CG methods whose update parameters $\beta_k$ are defined by hybrid rules of the kind surveyed in Subsection 4.3 (the individual definitions could not be recovered from the source).
The numerical experiment contains 25 test functions, each tested for several values of the problem dimension.
Summary numerical results for hybrid CG methods, tried on 25 test functions, with respect to IT, FE and CPU profiles are presented in Table 7.
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 | 1157 |
Raydan 2 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
Diagonal 2 | 1584 | 1581 | 1542 | 1488 | 1500 | 2110 | 2193 | 1843 | 1475 | 1453 |
Extended Tridiagonal 1 | 805 | 623 | 754 | 2110 | 2160 | 10129 | 1167 | 966 | NaN | 270 |
Diagonal 4 | 60 | 60 | 70 | 60 | 70 | 70 | 60 | 70 | NaN | 113 |
Diagonal 5 | 124 | 39 | 98 | 39 | 120 | 109 | 39 | 141 | 154 | 130 |
Extended Himmelblau | 145 | 139 | 111 | 161 | 181 | 207 | 159 | 381 | 109 | 108 |
Full Hessian FH2 | 5036 | 5036 | 5036 | 4820 | 4820 | 4800 | 4994 | 4789 | 5163 | 5705 |
Perturbed quadratic diagonal | 1228 | 1214 | 1266 | 934 | 1093 | 987 | 996 | 1016 | NaN | 2679 |
Quadratic QF1 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | NaN | 1158 |
Quadratic QF2 | 2125 | 2098 | 2174 | 1995 | 1991 | 2425 | 2378 | NaN | 2204 | 2034 |
TRIDIA (CUTE) | NaN | NaN | NaN | 6210 | 6210 | 5594 | NaN | NaN | 6748 | 7345 |
Almost Perturbed Quadratic | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 | 1158 |
LIARWHD (CUTE) | 1367 | 817 | 1592 | 1024 | 1831 | 1774 | 531 | 2152 | NaN | 573 |
POWER (CUTE) | 7782 | 7782 | 7782 | 7779 | 7779 | 7802 | 7781 | 7780 | NaN | 7781 |
NONSCOMP (CUTE) | 10092 | 10746 | 8896 | 10466 | 9972 | 13390 | 11029 | 3520 | 3988 | 11411 |
QUARTC (CUTE) | 94 | 160 | 145 | 150 | 126 | 95 | 160 | 114 | 165 | 154 |
Diagonal 6 | 40 | 40 | 40 | 57 | 78 | 81 | 40 | 69 | NaN | 126 |
DIXON3DQ (CUTE) | 12182 | 5160 | 11257 | 5160 | 11977 | 14302 | 5160 | 17080 | NaN | 12264 |
BIGGSB1 (CUTE) | 10664 | 5160 | 10479 | 5160 | 11082 | 13600 | 5160 | 16192 | NaN | 11151 |
Generalized Quartic | 129 | 107 | 110 | 107 | 142 | 153 | 107 | 123 | 131 | 145 |
Diagonal 7 | 50 | NaN | 40 | NaN | 40 | 50 | NaN | 50 | 51 | 40 |
Diagonal 8 | 40 | 40 | 40 | 50 | NaN | 50 | 40 | NaN | NaN | 40 |
Full Hessian FH3 | 43 | 42 | 42 | 42 | 42 | 43 | 42 | 43 | NaN | NaN |
FLETCHCR (CUTE) | 17821 | 17632 | 18568 | 17272 | 17446 | 26794 | 24865 | NaN | 17315 | 20813 |
Figure 10 plots the performance profiles with respect to the IT criterion for the results included in Table 7.
From Figure 10, it is observable that the graph of the HCG6 method comes first to the top, which signifies that HCG6 outperforms the other considered methods with respect to the IT. However, looking at Figure 10 in more detail, we can see that the HCG6 method does not dominate uniformly; it is best on only a small portion of the test problems.
The numerical results of the hybrid CG methods with respect to the FE are arranged in Table 8.
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 | 3481 |
Raydan 2 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
Diagonal 2 | 6136 | 6217 | 6006 | 5923 | 5944 | 8281 | 8594 | 4822 | 5711 | 5636 |
Extended Tridiagonal 1 | 2369 | 1991 | 2275 | 4678 | 4924 | 22418 | 3119 | 2661 | NaN | 869 |
Diagonal 4 | 170 | 170 | 200 | 170 | 200 | 200 | 170 | 200 | NaN | 339 |
Diagonal 5 | 258 | 88 | 206 | 88 | 270 | 228 | 88 | 292 | 338 | 270 |
Extended Himmelblau | 855 | 687 | 583 | 763 | 813 | 961 | 757 | 1613 | 567 | 594 |
Full Hessian FH2 | 15115 | 15115 | 15115 | 14467 | 14467 | 14407 | 14989 | 14374 | 15495 | 17122 |
Perturbed quadratic diagonal | 3686 | 3647 | 3805 | 2805 | 3282 | 2967 | 2993 | 3053 | NaN | 8044 |
Quadratic QF1 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | NaN | 3484 |
Quadratic QF2 | 9455 | 9202 | 9501 | 9016 | 9054 | 10229 | 10086 | NaN | 9531 | 9085 |
TRIDIA (CUTE) | NaN | NaN | NaN | 18640 | 18640 | 16792 | NaN | NaN | 20260 | 22051 |
Almost Perturbed Quadratic | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 | 3484 |
LIARWHD (CUTE) | 7712 | 5931 | 8275 | 6165 | 8113 | 9395 | 5854 | 10305 | NaN | 4848 |
POWER (CUTE) | 23356 | 23356 | 23356 | 23347 | 23347 | 23416 | 23353 | 23350 | NaN | 23353 |
NONSCOMP (CUTE) | 31355 | 33211 | 27801 | 32705 | 31458 | 40807 | 34013 | 23411 | 13367 | 35106 |
QUARTC (CUTE) | 901 | 1254 | 1261 | 1224 | 1224 | 916 | 1254 | 1041 | 1347 | 1305 |
Diagonal 6 | 100 | 100 | 100 | 134 | 176 | 182 | 100 | 158 | NaN | 282 |
DIXON3DQ (CUTE) | 36508 | 15534 | 33759 | 15534 | 35926 | 42952 | 15534 | 51284 | NaN | 36796 |
BIGGSB1 (CUTE) | 31960 | 15534 | 31427 | 15534 | 33247 | 40846 | 15534 | 48620 | NaN | 33469 |
Generalized Quartic | 457 | 371 | 370 | 371 | 481 | 529 | 371 | 439 | 446 | 467 |
Diagonal 7 | 160 | NaN | 130 | NaN | 130 | 160 | NaN | 160 | 142 | 13 |
Diagonal 8 | 130 | 130 | 130 | 160 | NaN | 160 | 130 | NaN | NaN | 130 |
Full Hessian FH3 | 139 | 136 | 136 | 136 | 136 | 139 | 136 | 139 | NaN | NaN |
FLETCHCR (CUTE) | 166463 | 165774 | 168739 | 175309 | 175845 | 240240 | 184939 | NaN | 174406 | 215687 |
Figure 11 plots the performance profiles corresponding to the FE results in Table 8.
From Figure 11, it is observable that the graph of the HCG6 method is the first to reach the top, which signifies that HCG6 outperforms the other considered methods with respect to the FE criterion. However, a closer look at Figure 11 reveals the same situation as in Figure 10: HCG6 achieves the best FE result on only a portion of the test functions.
Table 9 contains numerical results of the hybrid CG methods with respect to the CPU.
Test function | HCG1 | HCG2 | HCG3 | HCG4 | HCG5 | HCG6 | HCG7 | HCG8 | HCG9 | HCG10 |
Perturbed Quadratic | 0.656 | 0.516 | 0.781 | 0.719 | 0.594 | 0.438 | 0.719 | 0.688 | 0.844 | 0.688 |
Raydan 2 | 0.031 | 0.063 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | 0.078 | NaN | 0.078 |
Diagonal 2 | 1.453 | 1.328 | 1.656 | 1.172 | 1.438 | 1.797 | 1.813 | 1.266 | 1.250 | 1.141 |
Extended Tridiagonal 1 | 1.016 | 1.125 | 1.359 | 2.250 | 2.375 | 7.578 | 1.672 | 1.375 | NaN | 0.922 |
Diagonal 4 | 0.031 | 0.031 | 0.031 | 0.078 | 0.078 | 0.047 | 0.109 | 0.094 | NaN | 0.094 |
Diagonal 5 | 0.141 | 0.063 | 0.156 | 0.094 | 0.094 | 0.125 | 0.109 | 0.078 | 0.219 | 0.156 |
Extended Himmelblau | 0.172 | 0.172 | 0.109 | 0.141 | 0.172 | 0.141 | 0.125 | 0.141 | 0.172 | 0.125 |
Full Hessian FH2 | 83.125 | 91.938 | 86.984 | 85.766 | 94.484 | 78.281 | 77.141 | 74.500 | 80.969 | 82.469 |
Perturbed quadratic diagonal | 0.406 | 0.609 | 0.641 | 0.375 | 0.563 | 0.359 | 0.328 | 0.344 | NaN | 0.734 |
Quadratic QF1 | 0.359 | 0.438 | 0.422 | 0.422 | 0.406 | 0.391 | 0.484 | 0.422 | NaN | 0.281 |
Quadratic QF2 | 1.047 | 1.313 | 1.203 | 1.156 | 1.063 | 1.156 | 1.000 | NaN | 1.094 | 1.047 |
TRIDIA (CUTE) | NaN | NaN | NaN | 1.688 | 1.391 | 1.859 | NaN | NaN | 1.875 | 1.391 |
Almost Perturbed Quadratic | 0.406 | 0.438 | 0.516 | 0.594 | 0.250 | 0.359 | 0.406 | 0.578 | 0.641 | 0.422 |
LIARWHD (CUTE) | 0.938 | 0.828 | 1.203 | 0.797 | 1.125 | 1.172 | 0.938 | 1.203 | NaN | 0.594 |
POWER (CUTE) | 1.563 | 1.672 | 1.750 | 1.609 | 1.625 | 1.578 | 1.625 | 1.188 | NaN | 1.453 |
NONSCOMP (CUTE) | 1.547 | 1.484 | 1.063 | 1.766 | 1.422 | 1.719 | 1.516 | 1.063 | 1.203 | 1.703 |
QUARTC (CUTE) | 0.750 | 1.000 | 0.969 | 0.969 | 0.875 | 0.797 | 0.938 | 0.703 | 1.266 | 0.93 |
Diagonal 6 | 0.078 | 0.078 | 0.078 | 0.094 | 0.063 | 0.016 | 0.016 | 0.125 | NaN | 0.109 |
DIXON3DQ (CUTE) | 2.047 | 1.453 | 2.016 | 1.484 | 2.359 | 2.234 | 1.406 | 2.297 | NaN | 2.078 |
BIGGSB1 (CUTE) | 1.875 | 2.047 | 2.359 | 1.750 | 2.250 | 2.391 | 1.422 | 2.672 | NaN | 2.422 |
Generalized Quartic | 0.063 | 0.125 | 0.141 | 0.156 | 0.125 | 0.094 | 0.078 | 0.109 | 0.172 | 0.109 |
Diagonal 7 | 0.063 | NaN | 0.016 | NaN | 0.109 | 0.063 | NaN | 0.063 | 0.063 | 0.063 |
Diagonal 8 | 0.078 | 0.125 | 0.078 | 0.031 | NaN | 0.063 | 0.109 | NaN | NaN | 0.078 |
Full Hessian FH3 | 0.063 | 0.047 | 0.109 | 0.047 | 0.031 | 0.063 | 0.047 | 0.109 | NaN | NaN |
FLETCHCR (CUTE) | 5.656 | 6.750 | 7.922 | 9.484 | 6.484 | 8.766 | 7.281 | NaN | 6.906 | 7.547 |
Figure 12 plots the performance profiles corresponding to the CPU results in Table 9.
From Figure 12, it is observable that the graph of the HCG6 method is the first to reach the top, which signifies that HCG6 outperforms the other considered methods with respect to the CPU criterion. However, a closer look at Figure 12 reveals the same situation as in Figures 10 and 11: HCG6 achieves the best CPU result on only a portion of the test functions.
The numerical experiments presented in this subsection investigate the influence of the size of the scalar $t$ in the modified Dai-Liao methods. The previously mentioned variants of the DL method use a fixed value of the parameter $t$.
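For orientation, recall the form in which the scalar $t$ enters the classical Dai-Liao scheme (standard notation; the four variants considered here modify this CGUP in their own ways):
$$
d_{k+1}=-g_{k+1}+\beta_k^{DL}\, d_k,\qquad
\beta_k^{DL}=\frac{g_{k+1}^{\mathrm T}y_k - t\, g_{k+1}^{\mathrm T}s_k}{d_k^{\mathrm T}y_k},\qquad t>0,
$$
where $s_k=x_{k+1}-x_k$ and $y_k=g_{k+1}-g_k$.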
This brings us to the question already indicated in the previous section, which this subsection aims to answer: what is the 'best' value of the scalar $t$? More precisely, we consider the following two questions:
- Does the value of the scalar $t$ influence the efficiency of the considered methods and, if so, to what extent?
- Does the choice of the 'best' value of $t$ depend on the applied method?
To answer these questions, all the methods have to be tested under the same conditions; accordingly, we compare all the considered methods with the same values of the required scalars. Algorithm 2, i.e., the backtracking line search, determines the step size in all tests.
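As a concrete illustration, a minimal sketch of such a backtracking (Armijo) line search is given below; the parameter values beta = 0.8 and sigma = 1e-4 are assumptions made for the sketch, not necessarily the values used in the experiments.

```python
import numpy as np

def backtracking(f, grad_f, x, d, beta=0.8, sigma=1e-4):
    """Armijo backtracking line search (a sketch in the spirit of
    Algorithm 2).  Assumes d is a descent direction at x.

    Shrinks the trial step t by the factor beta until the Armijo
    condition f(x + t d) <= f(x) + sigma * t * grad_f(x)^T d holds.
    """
    t, fx, slope = 1.0, f(x), np.dot(grad_f(x), d)
    while f(x + t * d) > fx + sigma * t * slope:
        t *= beta
    return t
```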
Algorithm 6 gives the corresponding general framework for the DHSDL, DLSDL, MHSDL and MLSDL methods.
Algorithm 6 Modified Dai-Liao conjugate gradient methods.
Require: A starting point $x_0$, a tolerance $\varepsilon>0$ and a scalar $t>0$.
1: Set $k=0$, compute $g_0=\nabla f(x_0)$ and put $d_0=-g_0$.
2: If $\|g_k\|\leq\varepsilon$, STOP; else perform Step 3.
3: Determine the step size $\alpha_k$ by the backtracking line search (Algorithm 2).
4: Compute $x_{k+1}=x_k+\alpha_k d_k$.
5: Calculate $g_{k+1}$ and $y_k=g_{k+1}-g_k$, $s_k=x_{k+1}-x_k$.
6: Calculate the CGUP $\beta_k$ by the rule of the chosen method (DHSDL, DLSDL, MHSDL or MLSDL).
7: Compute the new direction $d_{k+1}=-g_{k+1}+\beta_k d_k$.
8: Set $k:=k+1$ and go to Step 2.
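Assuming the framework above, a compact Python sketch of one such iteration follows. The CGUP shown is the classical Dai-Liao $\beta_k$; the DHSDL, DLSDL, MHSDL and MLSDL variants replace it with their respective modified formulas, so this is an illustrative skeleton rather than the authors' implementation. It reuses the `backtracking` sketch above, and the two stopping tests mirror the uniform criteria used in the experiments (combined here with a simple early exit for illustration).

```python
import numpy as np

def dl_cg(f, grad_f, x0, t=0.1, eps=1e-6, delta=1e-16, max_iter=50000):
    """Skeleton of Algorithm 6 with the classical Dai-Liao CGUP."""
    x = np.asarray(x0, dtype=float)
    g = grad_f(x)
    d = -g                                        # initial direction
    fx = f(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:              # gradient test
            break
        alpha = backtracking(f, grad_f, x, d)     # Algorithm 2
        x_new = x + alpha * d
        fx_new, g_new = f(x_new), grad_f(x_new)
        y, s = g_new - g, x_new - x
        denom = np.dot(d, y)
        if abs(denom) < 1e-12:
            d = -g_new                            # restart on breakdown
        else:
            # classical Dai-Liao CGUP with scalar t
            beta = (np.dot(g_new, y) - t * np.dot(g_new, s)) / denom
            d = -g_new + beta * d
        stalled = abs(fx_new - fx) / (1.0 + abs(fx)) <= delta
        x, g, fx = x_new, g_new, fx_new
        if stalled:                               # function-value test
            break
    return x
```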
Table 10 lists the values of the scalar $t$ used in the experiments.
Label | T1 | T2 | T3 | T4 | T5 | T6
Value of the scalar $t$ |  | 0.05 | 0.1 | 0.2 | 0.5 | 0.9
The reasons why we decided to use these particular values of the scalar $t$ are the following:
-
-
The convergence of the above methods has already been considered in the cited references. Our goal is to give a unified analysis of the convergence of the proposed DL methods, while the aim of our experiments is to investigate the influence of the scalar $t$ on their numerical behavior. Each particular value of the scalar $t$ from Table 10 defines one tested variant (T1-T6) of each method.
The numerical results summarized below are generated by testing and comparing the DHSDL, DLSDL, MHSDL and MLSDL methods under the same conditions as in the previous experiments.
The uniform stopping criteria are (as before)
$$
\|g_k\|\leq 10^{-6}\quad\text{and}\quad \frac{|f_{k+1}-f_k|}{1+|f_k|}\leq 10^{-16}.
$$
The backtracking parameters for all algorithms are
During the testing of the DHSDL and DLSDL methods, the constant parameter
Figure 13 shows the performance profiles of the DHSDL method with respect to the IT criterion, depending on the value of the scalar $t$. Figure 14 shows the corresponding FE profiles of the DHSDL method, and Figure 15 its CPU time profiles.
The graphs displayed in Figures 13-15 show that the DHSDL method achieved its best average IT and FE results for the variant T2 (cf. Tables 11 and 12).
Figure 16 illustrates the IT performance profiles of the DLSDL method, depending on the value of the scalar $t$. Figure 17 shows the corresponding FE profiles of the DLSDL method, and Figure 18 its CPU time profiles.
Based on the analysis of Figures 16-18, we conclude that the DLSDL method achieved its best results for the variant T2 (cf. Tables 11-13).
In Subsection 6.5.1 we defined two questions; our aim in this subsection is to answer them. To answer the first question properly, Tables 11, 12 and 13 collect the average values of the three considered criteria (IT, FE and CPU, respectively).
Table 11. Average IT values.
Method | T1 | T2 | T3 | T4 | T5 | T6
DHSDL | 32980.14 | 31281.32 | 33640.45 | 32942.36 | 34448.32 | 33872.36 |
DLSDL | 30694.00 | 28701.14 | 31048.32 | 30594.77 | 31926.59 | 31573.05 |
MHSDL | 29289.73 | 27653.64 | 29660.00 | 29713.50 | 30491.18 | 30197.27 |
MLSDL | 25398.82 | 22941.77 | 24758.27 | 24250.68 | 25722.64 | 25032.64 |
Table 12. Average FE values.
Method | T1 | T2 | T3 | T4 | T5 | T6
DHSDL | 1228585.50 | 1191960.55 | 1252957.09 | 1238044.36 | 1271176.59 | 1255710.45 |
DLSDL | 1131421.41 | 1083535.14 | 1149482.41 | 1134315.00 | 1167030.14 | 1158554.77 |
MHSDL | 1089700.41 | 1036710.32 | 1089777.64 | 1091985.41 | 1105299.91 | 1101380.18 |
MLSDL | 904217.14 | 845017.55 | 891669.50 | 879473.14 | 913165.68 | 895652.36 |
Table 13. Average CPU time values.
Method | T1 | T2 | T3 | T4 | T5 | T6
DHSDL | 902.06 | 894.73 | 917.77 | 930.56 | 911.28 | 870.93 |
DLSDL | 816.08 | 790.63 | 804.69 | 816.28 | 803.84 | 809.67 |
MHSDL | 770.78 | 751.65 | 728.61 | 749.70 | 712.64 | 720.57 |
MLSDL | 573.14 | 587.41 | 581.50 | 576.32 | 582.62 | 580.96 |
After the numerical testing of the compared methods and the individual analysis of each method, we can now give a global conclusion about the behavior of the observed methods. The first conclusion is that the value of the scalar $t$ visibly influences the behavior of all four methods: with respect to the average IT and FE values in Tables 11 and 12, each of the methods attains its best results for the variant T2.
An overview of QN methods and CG methods, together with their classification, has been presented. Section 5 investigates the convergence properties of CG methods, following the CG classes in accordance with the presented taxonomy of basic CG methods. Numerical experiments compare the main classes of QN and CG methods. More precisely, the main QN methods with constant diagonal Hessian approximation are compared, as well as two classes of basic CG methods, hybrid CG methods and, finally, some variants of the modified Dai-Liao methods.
The problem of defining further improvements of the CGUP remains an interesting direction for further research.
One prospective field for further research includes a generalization of the discrete-time approach to the continuous-time approach considered in [69]. Another possibility for further research is the extension of gradient methods to the tensor case; this possibility was exploited in [77].
Moreover, applications of low-rank updates used in optimization can be transferred to appropriate numerical methods for computing generalized inverses. Applications of the rank-one update formula are investigated in [115,116].
Predrag Stanimirović and Dijana Mosić are supported by the Ministry of Education, Science and Technological Development, Grants No. 174013 and 174007. Haifeng Ma is supported by the National Natural Science Foundation of China under grant 11971136 and the bilateral project between China and Poland (no. 37-18).