1. Introduction
The NBLP problem is a nonlinear optimization problem constrained by another nonlinear optimization problem. This mathematical programming model arises when two independent decision makers, ordered within a hierarchical structure, have conflicting objectives. The decision maker at the lower level has to optimize her objective under the parameters given by the upper level decision maker, who, in return, with complete information on the possible reactions of the lower level, selects the parameters so as to optimize her own objective. The decision maker with the upper level objective fu(t,v) takes the lead and chooses her decision vector t. The decision maker with the lower level objective fl(t,v) reacts accordingly by choosing her decision vector v to optimize her objective, parameterized in t. Note that the upper level decision maker is limited to influencing, rather than controlling, the lower level's outcome. In fact, the problem has been proved to be NP-hard [5]. Nevertheless, the NBLP problem is used extensively in transportation networks, finance budgeting, resource allocation, price control, etc. Various approaches have been devoted to this field, leading to a speedy development of theories and algorithms; see [1,3,30,32,41]. For a detailed exposition, the reader may review [23,25,35].
A mathematical formulation for the NBLP problem is
where t∈ℜn1 and v∈ℜn2.
Let n=n1+n2, and assume that the functions fu:ℜn→ℜ, fl:ℜn→ℜ, gu:ℜn→ℜm1, and gl:ℜn→ℜm2 are at least twice continuously differentiable functions in our method.
Several approaches have been proposed to solve the NBLP problem 1.1; see [2,13,14,25,27,37,40,42]. The KKT conditions are one of these approaches, and they are used in this paper to convert the original NBLP problem 1.1 to the following one-level programming problem:
where λl∈ℜm2 is a multiplier vector associated with the inequality constraint gl(t,v). Problem 1.2 is non-convex and non-differentiable; moreover, the regularity assumptions needed to successfully handle smooth optimization problems are never satisfied, so our approach cannot be applied to problem 1.2 directly. Dempe [13] presents a smoothing method for the NBLP problem, and the same method is also presented in [28] for programming with complementarity constraints. Following this smoothing method, we can propose our approach for the NBLP problem. Before presenting it, we first give some definitions.
Definition 1.1. The Fischer-Burmeister function Ψ(˜a,˜b):ℜ2→ℜ is defined by Ψ(˜a,˜b)=˜a+˜b−√(˜a2+˜b2), and the perturbed Fischer-Burmeister function Ψ(˜a,˜b,ϵ):ℜ3→ℜ is defined by Ψ(˜a,˜b,ϵ)=˜a+˜b−√(˜a2+˜b2+ϵ).
The Fischer-Burmeister function has the property that Ψ(˜a,˜b)=0 if and only if ˜a≥0, ˜b≥0, and ˜a˜b=0; it is non-differentiable at ˜a=˜b=0. Its perturbed variant satisfies Ψ(˜a,˜b,ϵ)=0 if and only if ˜a>0, ˜b>0, and ˜a˜b=ϵ/2 for ϵ>0. The perturbed function is smooth with respect to ˜a and ˜b for ϵ>0. For more details, see [8,9,10,28].
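To make these properties concrete, here is a minimal numerical sketch (in Python with NumPy; the function names are ours, not the paper's, and the perturbation form assumes the √(˜a2+˜b2+ϵ) reading above):

```python
import numpy as np

def fb(a, b):
    # Fischer-Burmeister: zero exactly when a >= 0, b >= 0, and a*b = 0,
    # but non-differentiable at a = b = 0.
    return a + b - np.sqrt(a**2 + b**2)

def fb_perturbed(a, b, eps):
    # Perturbed variant: smooth in (a, b) for eps > 0; it vanishes exactly
    # when a > 0, b > 0, and a*b = eps/2.
    return a + b - np.sqrt(a**2 + b**2 + eps)

# A complementary pair (a = 3, b = 0) is a root of the unperturbed function.
print(fb(3.0, 0.0))                    # 0.0
# With eps = 0.02 the root moves strictly inside the positive orthant:
# a = b = 0.1 gives a*b = 0.01 = eps/2, so the perturbed function vanishes.
print(fb_perturbed(0.1, 0.1, 0.02))    # ~0.0
```

Note how the perturbation replaces the complementarity kink at the origin by a smooth surface, which is exactly what allows Newton-type methods to be applied to problem 1.2.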
In this paper, to allow the proposed algorithm ACBTR to solve the NBLP problem 1.1 and satisfy the asymptotic stability conditions, we use the following modified perturbed Fischer-Burmeister function:
It is obvious that the modified perturbed Fischer-Burmeister function ˜Ψ(˜a,˜b,ϵ) has the same properties as the function Ψ(˜a,˜b,ϵ). Using the Fischer-Burmeister function 1.3, problem 1.2 is equivalent to the following single objective constrained optimization problem
Let x=(t,v)T and m=n2+m2; then the above problem can be written as the following SONP problem
where fu:ℜn→ℜ, hl:ℜn→ℜm, and gu:ℜn→ℜm1 are at least twice continuously differentiable functions.
Various approaches have been proposed to solve the SONP problem 1.5; see [4,7,16,17,18,19,24]. In this paper, we use an active-set strategy with a barrier method to reduce the SONP problem 1.5 to an equivalent equality constrained optimization problem, so that any of the methods for solving equality constrained optimization problems can then be applied.
In this paper, we use a trust-region technique, which is a successful approach for solving the SONP problem and is very important for ensuring global convergence from any starting point. The trust-region strategy induces strong global convergence and is robust with respect to rounding errors. It requires neither that the Hessian of the objective function be positive definite nor that the objective function of the model be convex. Some criteria are used to test whether the trial step is acceptable. If it is not, the subproblem is resolved with a reduced trust-region radius. For detailed expositions, the reader may review [17,20,21,22,23,24,33,36,45,46,47,48].
A projected Hessian method, suggested by [6,38] and used by [19,20,22], is utilized in this paper to treat the difficulty of an infeasible trust-region subproblem. In this method, the trial step is decomposed into two components, each computed by solving a trust-region unconstrained subproblem.
Under five standard assumptions, a global convergence theory for the ACBTR algorithm is introduced. Moreover, numerical experiments show that the ACBTR algorithm performs effectively and efficiently in practice.
The balance of this paper is organized as follows. A detailed description of the proposed method for solving the SONP problem 1.5 is introduced in the next section. Section 3 is devoted to the analysis of the global convergence of the ACBTR algorithm. In Section 4, we report preliminary numerical results. Finally, some further remarks are given in Section 5.
Notations: We use ‖.‖ to denote the Euclidean norm ‖.‖2. The i-th component of any vector x is written as x(i). The jth trial iterate of iteration k is denoted by kj. The subscript k refers to iteration indices; for example, fuk≡fu(xk), hlk≡hl(xk), guk≡gu(xk), Wk≡W(xk), ∇xLsk≡∇xLs(xk,λk;σk), and so on denote function values at a particular point.
2. An active-set with barrier method and trust-region strategy
In this section, we first give a detailed description of the active-set strategy with the barrier method used to reduce the SONP problem 1.5 to an equality constrained optimization problem. Second, to solve the equality constrained optimization problem and guarantee convergence from any starting point, we describe the trust-region algorithm. Finally, we present the main steps of the main algorithm ACBTR for solving the NBLP problem 1.1.
2.1. An active-set strategy and barrier method
Motivated by the active-set strategy introduced by [12] and used by [17,18,19,20,21], we define a 0-1 diagonal matrix W(x)∈ℜm1×m1 whose diagonal entries are
where i=1,...,m1. Using the diagonal matrix W(x)∈ℜm1×m1, we can transform problem 1.5 into the following equality constrained optimization problem with positive variables
Penalty methods are usually more suitable for problems with equality constraints. These methods generate a sequence of points that converges to a solution of the problem from the exterior of the feasible region. An advantage of penalty methods is that they do not require the iterates to be strictly feasible. In this paper, we use the penalty method to reduce the above problem to the following equality constrained optimization problem with positive variables
where σ is a positive parameter. Let F+(x)={x|x>0}.
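The active-set penalty reduction above can be sketched numerically as follows. This is a sketch under two assumptions, since the displayed definitions (2.1) and (2.2) are not reproduced here: that W(x) flags the active or violated inequalities (gu_i(x) ≥ 0 for constraints written gu(x) ≤ 0), and that the penalty term has the quadratic form (σ/2)guᵀWgu suggested by the gradient term σ∇gu(x)W(x)gu(x) used later.

```python
import numpy as np

def active_set_matrix(gu_vals):
    # 0-1 diagonal matrix: entry i is 1 when constraint i is active or
    # violated, 0 when it is strictly satisfied (our assumed rule).
    return np.diag((gu_vals >= 0.0).astype(float))

def penalized_objective(fu, gu, x, sigma):
    # fu(x) + (sigma/2) * gu(x)^T W(x) gu(x): only the flagged constraints
    # contribute, so satisfied inequalities leave the objective untouched.
    gu_vals = gu(x)
    W = active_set_matrix(gu_vals)
    return fu(x) + 0.5 * sigma * gu_vals @ W @ gu_vals

# Toy problem: minimize x^2 subject to 1 - x <= 0 (i.e., x >= 1).
fu = lambda x: float(x[0]**2)
gu = lambda x: np.array([1.0 - x[0]])
print(penalized_objective(fu, gu, np.array([0.0]), sigma=10.0))  # infeasible: 0 + 5 = 5.0
print(penalized_objective(fu, gu, np.array([2.0]), sigma=10.0))  # feasible: 4.0
```

The point of the 0-1 matrix is visible in the toy run: at the feasible point the penalty vanishes entirely, while at the infeasible point only the violated constraint is penalized.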
Motivated by the barrier method discussed in [7,26,43], problem 2.2, for any x∈F+, can be written as follows
for a decreasing sequence of barrier parameters s converging to zero; see [26].
The Lagrangian function associated with problem 2.3 is
where λ∈ℜm is a multiplier vector associated with the equality constraint hl(x)=0.
The first-order necessary condition for a strictly positive point x∗ to be a local minimizer of problem 2.3 is that there exists a Lagrange multiplier vector λ∗∈ℜm such that (x∗,λ∗) satisfies the following nonlinear system
where X is a diagonal matrix whose diagonal entries are (x1,...,xn)∈F+. Let y=sX∗−1e∈ℜn be an auxiliary variable; then the above system can be written as follows
where x∗∈F+. The conditions (2.5)–(2.7) are called the barrier KKT conditions. For more details, see [26].
Applying Newton's method to the nonlinear system (2.5)–(2.7), we have
where H is the Hessian matrix of the following function or an approximation to it
The matrix Y is a diagonal matrix whose diagonal entries are (y1,...,yn) and ∇xLs(x,λ;σ)=∇fu(x)−y+∇hl(x)λ+σ∇gu(x)W(x)gu(x).
From the second equation of the system (2.8) we have
To reduce the dimension of system 2.8, we eliminate dy from the first equation of system 2.8 by using Eq 2.9, as follows
Using Eq 2.6, we have the following system
where B=H+X−1Y+σ∇gu(x)W(x)∇gu(x)T.
We notice that system 2.10 is equivalent to the first-order necessary conditions for the following sequential quadratic programming problem
That is, a point (x∗,λ∗) that satisfies the KKT conditions for subproblem 2.11 also satisfies the KKT conditions for problem 1.5. The methods used to solve subproblem 2.11 are local methods; that is, they may not converge to a stationary point if the starting point is far away from the solution. To guarantee convergence from any starting point, we use the trust-region technique.
2.2. Trust-region strategy
By using the trust-region technique to ensure convergence of subproblem 2.11 and to estimate the step dk, we solve the following subproblem
where δk>0 represents the radius of the trust region. Subproblem 2.12 may be infeasible because there may be no intersection between the hyperplane of the linearized constraints hl(x)+∇hl(x)Td and the constraint ‖d‖≤δk. Even if they intersect, there is no guarantee that this will remain true if δk is reduced; see [11]. So, a projected Hessian technique is used in our approach to overcome this problem. This technique was suggested by [6,38] and used by [19,20,22]. In it, the trial step dk is decomposed into two orthogonal components: the normal component dnk, to improve feasibility, and the tangential component dtk, to improve optimality. Each of dnk and dtk is evaluated by solving an unconstrained trust-region subproblem.
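A common way to realize this orthogonal decomposition (a sketch; the paper does not specify which factorization it uses to build the null-space basis) is via the full QR factorization of ∇hl(x):

```python
import numpy as np

def nullspace_basis(A):
    # A is n x m with full column rank (columns = gradients of the equality
    # constraints). The last n - m columns of the full Q span null(A^T).
    n, m = A.shape
    Q, _ = np.linalg.qr(A, mode='complete')
    return Q[:, m:]

# Any step d then splits into orthogonal normal + tangential parts:
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))        # stands in for grad hl, full column rank
Z = nullspace_basis(A)                 # 5 x 3, orthonormal columns
d = rng.standard_normal(5)
dt = Z @ (Z.T @ d)                     # tangential component, in null(A^T)
dn = d - dt                            # normal component, in range(A)
print(np.allclose(A.T @ Z, 0.0))       # True: Z spans the null space of A^T
print(np.allclose(dn @ dt, 0.0))       # True: the two components are orthogonal
```

Moving along dt leaves the linearized constraints unchanged (∇hlᵀdt = 0), which is why feasibility and optimality can be improved by the two components independently.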
● To compute the normal component dnk
for some ζ∈(0,1). To solve subproblem 2.13, we use the conjugate gradient method introduced by [39] and used by [23]; see Algorithm 2.1 in [23]. It is very cheap if the problem is large-scale and the Hessian is indefinite. By using the conjugate gradient method, the following condition holds
for some ϑ1∈(0,1]. That is, the normal predicted decrease obtained by the normal component dnk is greater than or equal to a fraction of the normal predicted decrease obtained by the normal Cauchy step dncpk. The normal Cauchy step dncpk is defined as
where the parameter αncpk is given by
Once dnk is estimated, we compute dtk=Zkˉdtk, where Zk is a matrix whose columns form a basis for the null space of (∇hlk)T.
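For concreteness, the normal Cauchy point dncpk of (2.15)-(2.16), i.e. the minimizer of the Gauss-Newton model along its steepest-descent direction inside the trust region, can be sketched as follows (variable names are ours; the full dnk is then obtained by conjugate gradients as described above):

```python
import numpy as np

def normal_cauchy_step(A, h, radius):
    # Cauchy point of  min 0.5*||h + A^T d||^2  s.t. ||d|| <= radius,
    # where A stands for grad hl(x) (n x m) and h for hl(x). The step moves
    # along the negative model gradient g = A h, truncated at the boundary.
    g = A @ h
    gn = np.linalg.norm(g)
    if gn == 0.0:
        return np.zeros(A.shape[0])
    curv = np.linalg.norm(A.T @ g)**2          # g^T (A A^T) g
    if curv == 0.0 or radius * curv <= gn**3:  # boundary case
        alpha = radius / gn
    else:                                      # interior minimizer along -g
        alpha = gn**2 / curv
    return -alpha * g

# One linear constraint h(x) = x1 - 2 = 0 at x = (0, 0): A = [[1], [0]], h = [-2].
A = np.array([[1.0], [0.0]])
h = np.array([-2.0])
print(normal_cauchy_step(A, h, 10.0))   # [2. 0.]  -- restores feasibility exactly
print(normal_cauchy_step(A, h, 1.0))    # [1. 0.]  -- truncated at the radius
```

Condition (2.14) only requires dnk to achieve a fixed fraction of the decrease produced by this step, which is what makes the inexact conjugate gradient solution acceptable.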
● To compute the tangential component dtk
To estimate the tangential component dtk, let
and use the conjugate gradient method [23] to solve the following trust-region subproblem
where ∇qk(dnk)=∇xLsk+Bkdnk and Δk=√(δ2k−‖dnk‖2).
Let the tangential predicted decrease obtained by the tangential component dtk be
Since the conjugate gradient method is used to solve subproblem (2.18) and estimate ˉdtk, the following condition holds
for some ϑ2∈(0,1]. This condition clarifies that the tangential predicted decrease obtained by the tangential step ˉdtk is greater than or equal to a fraction of the tangential predicted decrease obtained by the tangential Cauchy step ˉdtcpk, which is defined as follows
where the parameter αtcpk is given by
such that ˉBk=ZTkBkZk.
Once dtk is estimated, we set dk=dnk+dtk and xk+1=xk+dk. To guarantee that xk+1∈F+ at every iteration k, we need to evaluate the damping parameter μk.
● To estimate the damping parameter μk
The damping parameter μk is defined as follows:
where
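Since the displayed formula (2.23) for μk is not reproduced here, the following sketch implements the standard fraction-to-the-boundary rule that such damping parameters enforce; the safeguard factor theta is our assumption, not the paper's:

```python
import numpy as np

def damping_parameter(x, d, theta=0.995):
    # Largest mu in (0, 1] such that x + mu*d stays strictly positive.
    # theta < 1 keeps the new iterate away from the boundary of F+.
    neg = d < 0.0                      # only negative directions can hit zero
    if not np.any(neg):
        return 1.0
    return min(1.0, theta * np.min(-x[neg] / d[neg]))

x = np.array([1.0, 0.5])
d = np.array([-2.0, 1.0])              # first component pushes toward zero
mu = damping_parameter(x, d)
print(mu)                              # 0.4975 = 0.995 * (1.0 / 2.0)
print(np.all(x + mu * d > 0.0))        # True: the damped step stays in F+
```

Without the damping, the full step x + d here would land at (-1, 1.5), outside F+, and the barrier term s∑ln x(i) would be undefined.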
To decide whether the scaled step μkdk will be accepted, we need a merit function, i.e., a function that ties the objective function fu(x) to the constraints hl(x) and gu(x) in such a way that progress in the merit function means progress in solving the problem. In the proposed algorithm, we use the following augmented Lagrangian function as a merit function; see [31].
where ρ>0 is a penalty parameter.
To decide whether the point (xk+μkdk,λk+1) will be taken as the next iterate, we need to define the actual reduction Aredk and the predicted reduction Predk in the merit function Φs(x,λ;σ;ρ).
In the proposed algorithm, Aredk is defined as follows
Also Aredk can be written as follows,
where Δλk=(λk+1−λk).
In the proposed algorithm, Predk is defined as follows
where ∇xls(x,λ)=∇fu(x)−y+∇hl(x)λ and ˜H=H+X−1Y.
Also, Predk can be written as follows
where the quadratic form q(d) in 2.17 can be written as follows
● To update ρk
To ensure that Predk≥0, we update the penalty parameter ρk using the following scheme.
Algorithm 2.1. If
then, set
where β0>0 is a small fixed constant.
Else, set
End if.
For more details, see [15,16,17,18,19].
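The displayed rules of Algorithm 2.1 are not reproduced here, so the following is only a sketch of the classical El-Alem-type safeguard that such schemes enforce, namely Predk ≥ (ρk/2)(‖hlk‖2−‖hlk+∇hTlkdk‖2), with β0 as the increase margin; the paper's exact test and increase rule may differ:

```python
def update_penalty(pred_rest, vpred, rho, beta0=0.1):
    # Pred_k = pred_rest + rho * vpred, where
    #   vpred = ||hl_k||^2 - ||hl_k + grad hl_k^T d_k||^2  (>= 0), and
    #   pred_rest collects the remaining, rho-independent terms.
    # Keep rho if Pred_k already covers half the feasibility gain; otherwise
    # raise it just enough, plus the margin beta0.
    if pred_rest + rho * vpred >= 0.5 * rho * vpred:
        return rho
    return 2.0 * (-pred_rest) / vpred + beta0

rho = update_penalty(pred_rest=-5.0, vpred=2.0, rho=1.0)
print(rho)   # 5.1; now Pred = -5 + 5.1*2 = 5.2 >= 0.5*5.1*2 = 5.1
```

With the updated ρ the slack Pred − (ρ/2)·vpred equals (β0/2)·vpred, so the safeguard holds with a strict margin whenever the step gains feasibility.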
● To test the scaling step μkdk and update δk
The framework to test the scaling step μkdk and update δk is presented in the following algorithm.
Algorithm 2.2. Choose 0<γ1<γ2<1, 0<α1<1<α2, and δmin≤δ0≤δmax.
While Aredk/Predk∈(0,γ1) or Predk≤0.
Set δk=α1‖dk‖ and return to evaluate a new trial step and end while.
If Aredk/Predk∈[γ1,γ2). Set xk+1=xk+μkdk and δk+1=max(δk,δmin).
End if.
If Aredk/Predk∈[γ2,1]. Set xk+1=xk+μkdk and δk+1=min{δmax,max{δmin,α2δk}}.
End if.
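Algorithm 2.2 can be transcribed almost line by line as follows. The parameter values are sample choices satisfying the stated ranges, and the rejection branch is made to cover Aredk ≤ 0 explicitly (the algorithm handles that case implicitly, since such a ratio never reaches [γ1,γ2)):

```python
import numpy as np

def evaluate_trial_step(x, mu, d, delta, ared, pred,
                        gamma1=0.25, gamma2=0.75, alpha1=0.5, alpha2=2.0,
                        delta_min=1e-3, delta_max=1e3):
    # Returns (new_x, new_delta, accepted), following Algorithm 2.2.
    if pred <= 0.0 or ared / pred < gamma1:
        # rejected: shrink the radius; the caller re-solves the subproblem
        return x, alpha1 * np.linalg.norm(d), False
    if ared / pred < gamma2:
        # acceptable step: keep the radius (never below delta_min)
        return x + mu * d, max(delta, delta_min), True
    # very successful step: allow the radius to grow, capped at delta_max
    return x + mu * d, min(delta_max, max(delta_min, alpha2 * delta)), True

x, d = np.array([0.0]), np.array([1.0])
print(evaluate_trial_step(x, 1.0, d, delta=1.0, ared=0.9, pred=1.0))  # accepted, radius doubled
print(evaluate_trial_step(x, 1.0, d, delta=1.0, ared=0.1, pred=1.0))  # rejected, radius halved
```

Note the asymmetry of the update: rejected steps shrink the radius relative to the step actually tried (α1‖dk‖), while successful steps grow it relative to the current δk.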
● To update the positive parameter σk
To update the positive parameter σk, we use the following scheme.
Algorithm 2.3. If
Set σk+1=σk.
Else, set σk+1=2σk. End if.
For more details see [18,23].
Finally, the algorithm is stopped when ‖ZTk∇xℓsk‖+‖∇gu(xk)Wkgu(xk)‖+‖hlk‖≤ε1 or ‖dk‖≤ε2, for some ε1,ε2>0.
● A trust-region algorithm
The framework of the trust-region algorithm for solving subproblem 2.12 is summarized as follows.
Algorithm 2.4. (Trust-region algorithm)
Step 0. Start with x0∈F+. Evaluate y0 and λ0. Set s0=0.1, ρ0=1, σ0=1, and β0=0.1.
Choose ε1, ε2, α1, α2, γ1, and γ2 such that 0<ε1, 0<ε2, 0<α1<1<α2, and 0<γ1<γ2<1.
Choose δmin, δmax, and δ0 such that δmin≤δ0≤δmax. Set k=0.
Step 1. If ‖ZTk∇xℓsk‖+‖∇gu(xk)Wkgu(xk)‖+‖hlk‖≤ε1, then stop.
Step 2. (How to compute dk)
a). Evaluate the normal component dnk by solving subproblem (2.13).
b). Evaluate the tangential component ˉdtk by solving subproblem (2.18).
c). Set dk=dnk+Zkˉdtk.
Step 3. If ‖dk‖≤ε2, then stop.
Step 4. (How to compute μk)
a). Compute the damping parameter μk using (2.23).
b). Set xk+1=xk+μkdk.
Step 5. Compute the vector yk+1, by using the following equation
The above equation is obtained from (2.9).
Step 6. Compute Wk+1 given by (2.1).
Step 7. Evaluate λk+1 by solving the following subproblem
Step 8. Use scheme 2.1 to update the penalty parameter ρk.
Step 9. Use Algorithm 2.2 to test the scaled step μkdk and update the radius δk.
Step 10. Update the positive parameter σk using scheme 2.3.
Step 11. To update the barrier parameter sk, set sk+1=sk/10.
Step 12. Set k=k+1 and go to Step 1.
In the following subsection we will clarify the main steps for solving NBLP problem 1.1.
2.3. An Active-set-barrier-trust-region algorithm
The framework for solving the NBLP problem 1.1 is summarized in the following algorithm.
Algorithm 2.5. (An active-set-barrier-trust-region (ACBTR) algorithm)
Step 1. Use the KKT optimality conditions for the lower level problem of 1.1 to reduce problem 1.1 to the one-level problem 1.2.
Step 2. Use the Fischer-Burmeister function 1.3 with ϵ=0.001 to obtain the smooth problem 1.4, which is equivalent to problem 1.5.
Step 3. Use the active-set strategy with the barrier method to obtain subproblem 2.11.
Step 4. Use the trust-region Algorithm 2.4 to solve subproblem 2.11 and obtain an approximate solution of problem 1.5.
In the following section we will introduce a global convergence analysis for ACBTR algorithm.
3. Global convergence analysis for ACBTR algorithm
Let Ω be a convex subset of ℜn that contains all iterates xk∈F+ and (xk+μkdk)∈F+. To prove the global convergence theory of the ACBTR algorithm on Ω, we assume that the following assumptions hold.
● Assumptions
[A1]. The functions fu(x), hl(x), and gu(x) are twice continuously differentiable functions for all 0<x∈Ω.
[A2]. All of fu(x), ∇fu(x), ∇2fu(x), gu(x), ∇gu(x), hl(x), ∇hl(x), ∇2hli(x) for i=1,...,m, and (∇hlk)[(∇hlk)T(∇hlk)]−1 are uniformly bounded in Ω.
[A3]. The columns of the matrix ∇hl(x) are linearly independent.
[A4]. The sequence {λk} is bounded.
[A5]. The sequence of matrices {˜Hk} is bounded.
In the above assumptions, even though we assume that ∇hl(x) has full column rank for all xk∈F+, we do not require that ∇gu(x) have full column rank for all xk∈F+. So, we may have other kinds of stationary points, presented in the following definitions.
Definition 3.1. A point x∗∈F+ is called a Fritz John (FJ) point if there exist τ∗, λ∗, and ν∗, not all zeros, such that
Equations (3.1)–(3.5) are called the FJ conditions. For more details, see [4].
If τ∗≠0, then the point (x∗,1,λ∗/τ∗,ν∗/τ∗) is called a KKT point, and the FJ conditions are called the KKT conditions.
Definition 3.2. A point x∗∈F+ is called an infeasible Fritz John (IFJ) point if there exist τ∗, λ∗, and ν∗ such that
Equations (3.6)–(3.10) are called IFJ conditions.
If τ∗≠0, then the point (x∗,1,λ∗/τ∗,ν∗/τ∗) is called an infeasible KKT point, and the IFJ conditions are called the infeasible KKT conditions.
Lemma 3.1. Under assumptions A1–A5, a subsequence {xki} of the iteration sequence asymptotically satisfies IFJ conditions if it satisfies:
1). limki→∞hl(xki)=0.
2). limki→∞‖Wkigu(xki)‖>0.
3). limki→∞{mind∈ℜn−m2‖Wki(guki+∇gTukiZkiμkiˉdt)‖2}=limki→∞‖Wkiguki‖2.
Proof. To simplify the notation, let the subsequence {ki} be renamed {k}. Let ˆdk be a minimizer of minˉdt‖Wk(gu(xk)+∇gu(xk)TZkμkˉdt)‖2; then it satisfies
From condition 3, we have
Now, we will consider two cases:
Firstly, if limk→∞ˆdk=0, then from (3.11) we have limk→∞μkZTk∇gu(xk)Wkgu(xk)=0.
Secondly, if limk→∞ˆdk≠0, then multiplying (3.11) from the left by 2ˆdTk and subtracting it from the limit (3.12), we have limk→∞‖Wk∇gu(xk)TZkμkˆdk‖2=0. This implies limk→∞μkZTk∇gu(xk)Wkgu(xk)=0.
That is, in either case, we have
Take (νk)i=(Wkgu(xk))i, i=1,...,p. Since limk→∞‖Wkgu(xk)‖>0, then limk→∞(νk)i≥0 for i=1,...,p, and limk→∞(νk)i>0 for some i. Therefore limk→∞ZTk∇gu(xk)νk=0. But this implies the existence of a sequence {λk} such that limk→∞{∇hlkλk+∇gu(xk)νk}=0. Thus the IFJ conditions hold in the limit with τ∗=0.
The following lemma clarifies that, for any subsequence {xki} of the iteration sequence that asymptotically satisfies the FJ conditions, the corresponding subsequence of smallest singular values of {ZTk∇gu(xk)Wk} is not bounded away from zero. That is, asymptotically the gradients of the active constraints are linearly dependent.
Lemma 3.2. Under assumptions A1–A5, a subsequence {xki} of the iteration sequence asymptotically satisfies FJ conditions if it satisfies:
1). limki→∞hl(xki)=0.
2). For all ki, ‖Wkiguki‖>0 and limki→∞Wkiguki=0.
3). limki→∞{mind∈ℜn−p‖Wki(guki+∇gTukiZkiμkiˉdt)‖2‖Wkiguki‖2}=1.
Proof. The proof of this lemma is similar to the proof of Lemma 4.4 in [19].
In the following subsection, we introduce some basic lemmas that are requisite to prove the global convergence analysis for the ACBTR algorithm.
3.1. Basic lemmas
In this section, we introduce some significant lemmas which are required to prove global convergence theory for ACBTR algorithm.
Lemma 3.3. Under assumptions A1 and A3, W(x)gu(x) is Lipschitz continuous in Ω.
Proof. The proof of this lemma is similar to the proof of Lemma 4.1 of [12].
From the above lemma, we conclude that gu(x)TW(x)gu(x) is differentiable and ∇gu(x)W(x)gu(x) is Lipschitz continuous in Ω.
Lemma 3.4. At any iteration k, let E(xk)∈ℜm1×m1 be a diagonal matrix whose diagonal entries are
where i=1,2,...,m2. Then
Proof. See Lemma 6.2 of [17].
Lemma 3.5. Under assumptions A1–A3, there exists at any iteration k, a constant C1>0 independent of k such that
where Ek∈ℜm1×m1 is the diagonal matrix whose diagonal entries are defined in (3.14).
Proof. See Lemma 6.3 of [17].
Lemma 3.6. Under assumptions A1–A3, there exists at any iteration k, a constant 0<C2 independent of k such that
Proof. Since dnk is normal to the tangent space, then we have
where ‖hlk+∇hTlkdk‖≤‖hlk‖. Using the assumptions A1–A5, we have the desired result.
The next lemma clarifies how accurately Aredk approximates Predk.
Lemma 3.7. Under assumptions A1–A5, there exists a constant 0<C3, such that
Proof. From the definition of Aredk (2.25) and using (3.15), we have
From the above equation, the definition of Predk (2.26), and using the inequality of Cauchy-Schwarz, we have
for some ξ1 and ξ2∈(0,1). By using assumptions A1–A5, ρk≥σk, ρk≥1, and inequality (3.16), we have
where κ1, κ2, and κ3 are positive constants. Since ρk≥1, ‖dk‖≤δmax, and ‖hlk‖ is uniformly bounded, inequality (3.18) holds.
The proofs of the following two lemmas depend on the fact that dnk and ˉdtk satisfy the fraction of Cauchy decrease condition.
Lemma 3.8. Under assumptions A1–A5, there exists a constant 0<C4 such that
Proof. From the definition of the normal Cauchy step (2.15), we will consider two cases:
Firstly, if dncpk=−(δk/‖∇hlkhlk‖)(∇hlkhlk) and δk‖∇hTlk∇hlkhlk‖2≤‖∇hlkhlk‖3, then we have
Secondly, if dncpk=−(‖∇hlkhlk‖2/‖∇hTlk∇hlkhlk‖2)(∇hlkhlk) and δk‖∇hTlk∇hlkhlk‖2≥‖∇hlkhlk‖3, then we have
Using assumption A3, we have ‖∇hlkhlk‖≥‖hlk‖/‖(∇hTlk∇hlk)−1∇hTlk‖. Hence, from inequalities (2.14), (3.21), and (3.22) and using assumption A2, we obtain inequality (3.20).
From the above lemma and the fact that
where μk∈(0,1], we have
From the way of updating ρk shown in Step 8 of Algorithm 2.4 and the above inequality, we have
Lemma 3.9. Under assumptions A1–A5, there exists a constant 0<C5, such that
Proof. From the definition of the tangential Cauchy step (2.21), we will consider two cases:
Firstly, if ˉdtcpk=−(Δk/‖ZTk∇qk(dnk)‖)ZTk∇qk(dnk) and Δk(ZTk∇qk(dnk))TˉBkZTk∇qk(dnk)≤‖ZTk∇qk(dnk)‖3, then we have
Secondly, if ˉdtcpk=−(‖ZTk∇qk(dnk)‖2/((ZTk∇qk(dnk))TˉBkZTk∇qk(dnk)))ZTk∇qk(dnk) and Δk(ZTk∇qk(dnk))TˉBkZTk∇qk(dnk)≥‖ZTk∇qk(dnk)‖3, then we have
Hence, from inequalities (2.20), (3.26), (3.27), and using assumptions A1–A5, we obtain the desired result.
From (2.19), (3.25), and the fact that
where μk∈(0,1], we have
That is
The following lemma clarifies that if, at any iteration k, the point xk∈F+ is not feasible, then the ACBTR algorithm cannot loop infinitely without finding an acceptable step.
Lemma 3.10. Under assumptions A1–A5, if ‖hlk‖≥ε>0, then the condition Aredkj/Predkj≥γ1 will be satisfied for some finite j.
Proof. From inequalities (3.18), (3.24), and the condition ‖hlk‖≥ε, we have
Now as the trial step dkj gets rejected, δkj becomes small and eventually we will have
Since j is finite, this inequality implies that the acceptance rule will be met. This completes the proof.
Lemma 3.11. Under assumptions A1–A5, if the jth trial step of iteration k satisfies
then the step is accepted.
Proof. The proof of this lemma is by contradiction. Assume that inequality (3.30) holds and the step dkj is rejected. From inequalities (3.18), (3.24), and (3.30), we have
This is a contradiction and this completes the proof.
Lemma 3.12. Under assumptions A1–A5, for every jth trial step of any iteration k, δkj satisfies
where b1>0 is a constant.
Proof. For the jth trial step of any iteration k, we consider two cases:
Firstly, if j=1 and the step is accepted, then δk≥δmin. Hence,
where b1=supx∈Ω‖hl(x)‖. Then (3.31) holds in this case.
Secondly, if j>1, then there exists at least one rejected trial step, and hence from Lemma 3.11 we have
for all i=1,2,...,j−1. From Algorithm 2.2, since dki is a rejected trial step, we have
From inequalities (3.32) and (3.33), the desired result is obtained.
The next lemma proves that as long as ‖hlk‖ is bounded away from zero, the trust-region radius is also bounded away from zero.
Lemma 3.13. Under assumptions A1–A5, if ‖hlk‖≥ε>0, then
where C6>0 is a constant.
Proof. The proof follows directly by taking
in inequality (3.31).
3.2. Global convergence theory when σk→∞
In this section, we clarify the convergence of the sequence of iteration when the positive parameter σk→∞.
Lemma 3.14. Under assumptions A1–A5, if ρk is increased at any iteration k, then
where C7 is a positive constant.
Proof. From the way of updating the positive penalty parameter ρk, we notice that ρk is increased at a given iteration k according to one of the two rules (2.30) or (2.31). Suppose that ρk is increased according to rule (2.30); then
Using inequalities (3.23) and (3.31), then we have
According to rule (2.31), we have ρk≥σ2k. Hence
Then,
where μk≤1. Using the Cauchy-Schwarz inequality, assumptions A3–A5, and the fact that ‖dk‖≤δmax, the proof is completed.
Lemma 3.15. Under assumptions A1–A5, if σk→∞ and there exists an infinite subsequence {ki} of the iteration sequence at which ρk is increased, then
Proof. The proof follows directly from limki→∞μki=1, σk→∞, and Lemma 3.14.
Theorem 3.1. Under assumptions A1–A5, if σk→∞, then
Proof. The proof is similar to the proof of Theorem 4.18 in [19].
We notice from the way of updating σ that the sequence {σk} is unbounded only when there exists an infinite subsequence of indices {ki} at which
The following lemma shows that, if σk→∞ and lim supk→∞‖Wkgu(xk)‖>0, then the iteration sequence generated by the algorithm ACBTR has a subsequence that satisfies IFJ conditions in the limit.
Lemma 3.16. Under assumptions A1–A5, if σk→∞ and there exists a subsequence {kj} of indices indexing iterates that satisfy ‖Wkgu(xk)‖≥ε>0 for all k∈{kj}, then a subsequence of the iteration sequence indexed by {kj} satisfies the IFJ conditions as k→∞.
Proof. The proof is by contradiction. Let the subsequence {kj} be renamed {k} to simplify the notation. Suppose that no subsequence of the sequence of iterates satisfies the IFJ conditions in the limit. Then we have |‖Wkgu(xk)‖2−‖Wk(gu(xk)+∇gu(xk)TZkμkˉdtk)‖2|≥ε1>0 from Lemma 3.1. Also, we have ‖ZTk∇gu(xk)Wkgu(xk)‖≥ε2>0 from (3.13). Since
and using (3.17), then we have
But {‖hlk‖} converges to zero and ‖ZTk∇gu(xk)Wk∇gu(xk)T‖ is bounded. Then ‖ZTk∇gu(xk)Wk(gu(xk)+∇gu(xk)Tdnk)‖≥ε2/2, and therefore
Hence inequality (3.29) can be written as follows
That is, for k sufficiently large, we have
Since σk→∞, there exists an infinite number of acceptable iterates at which (3.38) holds. That is, there is a contradiction unless σkΔk is bounded. Hence Δk→0, and therefore ‖dk‖→0. Now we consider two cases:
Firstly, if ‖Wkgu(xk)‖2−‖Wk(gu(xk)+∇gu(xk)TZkμkˉdtk)‖2>ε1, we have
Thus, from (2.19), (3.39), and assumptions A3–A5, we have Tpredk(μkˉdtk)→∞. That is, the left-hand side of inequality (3.38) goes to infinity while the right-hand side goes to zero. Hence there is a contradiction in this case.
Secondly, if ‖Wkgu(xk)‖2−‖Wk(gu(xk)+∇gu(xk)TZkμkˉdtk)‖2<−ε1, then
where σk→∞ as k→∞. As in the previous case, Tpredk(μkˉdtk)→−∞, which contradicts Tpredk(μkˉdtk)>0. These two contradictions prove the lemma.
The following lemma shows that if limk→∞σk=∞ and lim infk→∞‖Wkgu(xk)‖=0, then the iteration sequence generated by the ACBTR algorithm has a subsequence that satisfies the FJ conditions in the limit.
Lemma 3.17. Under assumptions A1–A5, let {kj} be a subsequence of iterates that satisfy ‖Wkgu(xk)‖>0 for all k∈{kj} and limkj→∞‖Wkjgukj‖=0. If limk→∞σk=∞, then a subsequence of {kj} satisfies the FJ conditions in the limit.
Proof. The proof of this lemma is similar to the proof of Lemma 4.20 in [19].
3.3. Global convergence theory when σk is bounded
In this section, we continue our discussion assuming that the parameter σk is bounded; that is, there exists an integer ˉk such that for all k≥ˉk, σk=ˉσ<∞, and
From assumptions A3 and A5 and assumption (3.40), we can say that there exists a constant b2>0 such that for all k≥ˉk
where Bk=˜Hk+ˉσ∇gu(xk)Wk∇gu(xk)T.
Lemma 3.18. Under assumptions A1–A5, there exists a constant C8>0 such that
for all k≥ˉk.
Proof. By using definition (2.28), we have
That is,
By using inequality (3.17), we can obtain the following inequality
From assumptions A3–A5, the fact that ‖dnk‖≤δmax, and (3.41), for all k≥ˉk there exists a constant C8>0 such that inequality (3.42) holds. This completes the proof.
Lemma 3.19. Under assumptions A1–A5, we have
for all k≥ˉk.
Proof. The definition of Predk (2.27) can be written as follows
and by using (2.19), we have
Using inequalities (3.29), (3.40), and (3.42), we can obtain the desired result.
Lemma 3.20. Under assumptions A1–A5, if ρk is increased at iteration k, then there exists a constant C9>0 such that
Proof. Since ρk is increased at iteration k, from (2.30) we have
Applying inequality (3.23) to the left hand side and inequalities (3.29), (3.40), and (3.42) to the right hand side, we obtain
The rest of the proof follows using the fact that μk≤1 and assumption A3.
Lemma 3.21. Under assumptions A1–A5, if ‖ZTk(∇xℓs(xk,λk)+ˉσ∇gu(xk)Wkgu(xk))‖+‖∇gu(xk)Wkgu(xk)‖≥ε>0 and ‖hlk‖≤ηδk, where η>0 is given by
then there exists a constant C10>0, such that
Proof. From the hypothesis, either ‖ZTk(∇xℓs(xk,λk)+ˉσ∇gu(xk)Wkgu(xk))‖≥ε/2 or ‖∇gu(xk)Wkgu(xk)‖≥ε/2. From inequality (3.17) and using (3.41), we have
Since η≤ε/(6b2C2δmax), we have
Because Δk=√(δ2k−‖dnk‖2) and ‖dnk‖≤C2‖hlk‖≤C2ηδk≤C2(√3/(2C2))δk=(√3/2)δk, we have Δ2k=δ2k−‖dnk‖2≥δ2k−(3/4)δ2k=(1/4)δ2k. Thus,
From inequalities (3.43), (3.47) and (3.48), we have
Since η≤min{(C5ε/(12C8))min{2ε/(3δmax),1},(ε/(4C8))min{ε/δmax,1}}, we have
The result follows if we take C10=min{(C5ε/24)min{2ε/(3δmax),1},(ε/8)min{2ε/δmax,1}}.
We can easily see from Lemma 3.21 that, at any iteration at which ‖ZTk(∇xℓs(xk,λk)+ˉσ∇gu(xk)Wkgu(xk))‖+‖∇gu(xk)Wkgu(xk)‖≥ε and ‖hlk‖≤ηδk, where η is given by (3.45), there is no need to increase the value of ρk. It is only increased when ‖hlk‖>ηδk.
Lemma 3.22. Under assumptions A1–A5, if ρkj is increased at the jth trial iterate of any iteration k, then
where C11>0 is a constant.
Proof. The proof of this lemma follows directly from inequalities (3.31) and (3.44).
Lemma 3.23. Under assumptions A1–A5, if ρk→∞, then
where {ki} is a subsequence of iterates at which the penalty parameter is increased.
Proof. The proof of this lemma follows directly from Lemma 3.22 and limk→∞μk=1.
3.4. Main global convergence theory
In this section, we will prove the main global convergence theorems for the proposed algorithm ACBTR.
Theorem 3.2. Assume that assumptions A1–A5 hold; then the sequence of iterates generated by the ACBTR algorithm satisfies
Proof. Suppose that lim supk→∞‖hlk‖≥ε, where ε>0 is a constant. Then there exists an infinite subsequence of indices {kj} indexing iterates that satisfy ‖hlkj‖≥ε/2. From Lemma 3.10, we know that there exists an infinite sequence of acceptable steps, so to simplify, we assume that all members of the sequence {kj} are acceptable iterates. Now we consider two cases:
Firstly, suppose that {ρk} is unbounded. Then there exists an infinite number of iterates {ki} at which ρk is increased. From Lemma 3.23, for k sufficiently large we can say {ki}⋂{kj}=∅. Let kζ1 and kζ2 be two consecutive iterates at which ρk is increased, with kζ1<k<kζ2 for any k∈{kj}. Notice that ρk is the same for all iterates between kζ1 and kζ2. Since all the iterates of {kj} are acceptable, then
for all k∈{kj}. Using inequality (3.24), we have
Summing over all acceptable iterates that lie between kζ1 and kζ2, we have
where ˆC6 is defined as C6 in (3.34), with ε replaced by ε/2. Hence,
Since ρk→∞, for kζ1 sufficiently large we have
Therefore,
But this leads to a contradiction with Lemma (3.23) unless ε=0.
Secondly, if {ρk} is bounded, then there exists an integer ˜k such that for all k≥˜k, ρk=˜ρ. Hence, from inequality (3.24), for any ˆk∈{kj} with ˆk≥˜k we have
Since all the iterates of {k_j} are acceptable, then for any k̂∈{k_j}, we have
Using inequality (3.52), we have
Using Lemma 3.13, we have
This contradicts the fact that {Φ_k} is bounded when the penalty-parameter sequence {ρ_k} is bounded. Hence, in both cases the supposition is false and the theorem is proved.
Theorem 3.3. Under assumptions A1–A5, the sequence of iterates generated by the ACBTR algorithm satisfies
Proof. To prove this theorem, we will prove
by contradiction. That is, we assume that ‖Z_k^T(∇_x ℓ_s(x_k,λ_k)+σ̄∇g_u(x_k)W_k g_u(x_k))‖+‖∇g_u(x_k)W_k g_u(x_k)‖>ε and that there exists an infinite subsequence {k_i} of the iteration sequence such that ‖h_{k_i}^l‖>ηδ_{k_i}. Since ‖h_{k_i}^l‖→0 as k_i→∞, then
Let kj be any iteration in {ki}. Then we will consider two cases:
First, suppose {ρ_k} is unbounded and trial step j−1 of iteration k was rejected. Then ‖h_k^l‖>ηδ_k^j=α_1η‖d_k^{j−1}‖. Hence, from inequalities (3.24) and (3.19), and the fact that d_k^{j−1} was rejected, we have
Since {ρ_k} is unbounded, there exists an iterate k̂ sufficiently large such that for all k≥k̂, we have
and
From the way of updating the radius of the trust region, we have
But this is a contradiction, so δ_k^j cannot go to zero in this case.
Second, suppose {ρ_k} is bounded; then there exist an integer k̄ and a constant ρ̄ such that ρ_k=ρ̄ for all k≥k̄. Let j be a trial step of iteration k at which ‖h_k^l‖>ηδ_k^j. We consider the following two cases:
I). If j=1, then from our way of updating the trust-region radius, we have δ_k^j≥δ_min. That is, δ_k^j is bounded in this case.
II). If j>1 and ‖h_k^l‖>ηδ_k^l for all l=1,⋯,j, then for all rejected trial steps l=1,⋯,j−1 of iteration k, we have
That is,
This means that δ_k^j is bounded.
Otherwise, if j>1 and ‖h_k^l‖>ηδ_k^l fails for some l, then there exists an integer β_1 such that ‖h_k^l‖>ηδ_k^l holds for l=β_1+1,…,j and ‖h_k^l‖≤ηδ_k^l for l=1,…,β_1. As in the above case, we can write
But from the way of updating the radius of the trust-region, we have
Since ‖h_k^l‖≤ηδ_k^l for l=1,…,β_1, then from Lemma 3.21 and the fact that d_k^{β_1} is rejected, we have
This implies
This implies that ‖d_k^{β_1}‖ is bounded. Hence δ_k^j is bounded in this case too. But this is a contradiction; that is, ‖h_k^l‖≤ηδ_k^j for all k_j sufficiently large.
Letting k_j≥k̄ and using Lemma 3.21, we have
As k→∞, then
That is, δ_k^j is not bounded below. But this leads to a contradiction; to exhibit it, we consider the following two cases:
i). If k_j>k̄ and the step was accepted at j=1, then δ_k≥δ_min. Hence δ_k^j is bounded in this case.
ii). If j>1 and there exists at least one rejected trial step d_k^{j−1}, then from Lemmas 3.7 and 3.21, we have
From the way of updating δ_k^j, we have
Hence δ_k^j is bounded in this case too. But this contradicts (3.57). This means that the supposition is incorrect. Hence,
But this also implies (3.53). This completes the proof of the theorem.
From the above two theorems, we conclude that, given any ε>0, the algorithm terminates, since ‖Z_k^T∇_x ℓ_s(x_k,λ_k)‖+‖∇g_u(x_k)W_k g_u(x_k)‖+‖h_k^l‖<ε for some finite k.
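This termination test can be expressed as a small predicate. The following Python sketch (the paper's code is MATLAB) simply combines the three measures, assuming the norm values have been computed elsewhere:

```python
def acbtr_converged(norm_Zg, norm_guWgu, norm_hl, eps=1e-8):
    """Sketch of the stopping test implied by Theorems 3.2 and 3.3:
    stop when the combined stationarity and feasibility measure
    ||Z_k^T grad_x l_s|| + ||grad g_u(x_k) W_k g_u(x_k)|| + ||h_k^l||
    drops below the tolerance eps (eps = 1e-8 in the experiments)."""
    return norm_Zg + norm_guWgu + norm_hl < eps
```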
4.
Numerical results
Algorithm ACBTR was implemented as a MATLAB code and run under MATLAB version 8.2.701 (R2013b), 64-bit (win64). We begin with a starting point x_0∈F^+ and use the following parameter settings: δ_min=10^{−4}, δ_0=max(‖d_0^{cp}‖,δ_min), δ_max=10^4 δ_0, γ_1=10^{−4}, γ_2=0.75, α_1=0.5, α_2=2, and ε=10^{−8}.
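To illustrate how these parameters interact, here is a minimal Python sketch of a standard trust-region acceptance and radius-update rule consistent with the values listed above. This is an assumption for illustration; the exact rule used by ACBTR is the one given in the algorithm statement, and the implementation in the paper is MATLAB:

```python
def update_radius(delta, ared_over_pred, step_norm,
                  gamma1=1e-4, gamma2=0.75, alpha1=0.5, alpha2=2.0,
                  delta_min=1e-4, delta_max=1e4):
    """Standard trust-region update (illustrative, not the authors'
    exact rule). `ared_over_pred` is the ratio of actual to predicted
    reduction in the merit function. Returns (accepted, new_delta)."""
    if ared_over_pred < gamma1:
        # Step rejected: shrink the radius around the rejected step.
        return False, alpha1 * step_norm
    if ared_over_pred >= gamma2:
        # Very successful step: expand the radius, kept within bounds.
        return True, min(max(alpha2 * delta, delta_min), delta_max)
    # Successful step: keep the radius (at least delta_min).
    return True, max(delta, delta_min)
```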
An extensive variety of numerical NBLP test problems is then used to illustrate the effectiveness of the proposed ACBTR algorithm.
For each test problem, 10 independent runs with different initial starting points are performed to check the consistency of the results. Statistical results for all test problems are summarized in Table 1. The results in Table 1 show that the solutions obtained by the ACBTR Algorithm (2.5) are equal or approximately equal to those obtained by the compared algorithms in the literature.
Table 1 also reports the average number of iterations (iter), the average number of function evaluations (nfunc), and the average CPU time in seconds (CPUs).
For comparison, we include the corresponding average CPU times (CPUs) obtained by the methods of [34] (Table 2), [29] (Table 3), and [44] (Table 4), respectively. The results make clear that the ACBTR algorithm can treat NBLP problems whether or not the upper and lower levels are convex, and that it converges to an optimal solution equal or close to the optima reported in the literature. Finally, the comparison between the solutions obtained by ACBTR and those in the literature shows that ACBTR finds the optimal solution of all problems with a small number of iterations, a small number of function evaluations, and less time.
Problem 1 [34]:
Problem 2 [34]:
Problem 3 [34]:
Problem 4 [34]:
Problem 5 [34]:
Problem 6 [34]:
Problem 7 [34]:
Problem 8 [34]:
Problem 9 [29]:
Problem 10 [29]:
Problem 11 [44]:
Problem 12 [29]:
Problem 13 [44]:
Problem 14 [44]:
5.
Conclusions
In this paper, we introduce an effective solution algorithm for the NBLP problem with positive variables. The algorithm is based on using the KKT conditions together with the Fischer-Burmeister function to transform the NBLP problem into an equivalent smooth SONP problem. An active-set strategy with a barrier method and a trust-region mechanism is used to ensure global convergence from any starting point. The ACBTR algorithm reduces the number of iterations and the number of function evaluations. The projected Hessian technique is used in ACBTR to overcome the difficulty of an infeasible trust-region subproblem. A global convergence theory for ACBTR is established under five standard assumptions.
Preliminary numerical experiments with the algorithm are presented, and its performance is reported. The numerical results show that our approach has merit and deserves further investigation. For future work, several questions should be answered:
● Our approach transforms the nonsmooth problem 1.2 into a smooth problem; can it be extended to other nonsmooth reformulations?
● Does using the interior-point method guarantee quadratic convergence to a stationary point?
Conflict of interest
The authors declare that there is no conflict of interest in this paper.