To avoid a singular generalized Jacobian matrix and to further accelerate the convergence of the generalized Newton (GN) iteration method for solving the generalized absolute value equations Ax - B|x| = b, in this paper we propose a new relaxed generalized Newton (RGN) iteration method by introducing a relaxation iteration parameter. The new RGN iteration method includes the well-known GN iteration method and the Picard iteration method as special cases. Theoretical analyses show that the RGN iteration method is well defined and globally linearly convergent under suitable conditions. In addition, a specific sufficient condition is derived when the coefficient matrix A is symmetric positive definite. Finally, two numerical experiments arising from linear complementarity problems illustrate the effectiveness of the new RGN iteration method.
1.
Introduction
Consider the following generalized absolute value equations (GAVE)
Ax − B|x| = b, (1.1)
where A,B∈Rn×n are two given matrices, b∈Rn is a given vector, and |⋅| denotes the componentwise absolute value. In particular, if B=I, where I stands for the identity matrix, then the GAVE (1.1) reduces to the standard absolute value equations (AVE)
Ax − |x| = b. (1.2)
The GAVE (1.1) and the AVE (1.2) arise in many scientific computing and engineering problems, including linear programming, linear complementarity problems (LCP), bimatrix games, quadratic programming and so on; see [1,2,3,4] for more details. Take the well-known LCP as an example: for a given matrix M∈Rn×n and a given vector q∈Rn, find two vectors z,w∈Rn such that
z ≥ 0, w = Mz + q ≥ 0, zTw = 0. (1.3)
Here and thereafter, (⋅)T denotes the transpose of either a vector or a matrix. By letting z=|x|−x and w=|x|+x, the LCP (1.3) can be equivalently transformed into the GAVE
(M+I)x − (M−I)|x| = q. (1.4)
In fact, the GAVE (1.4) can also be transformed into the LCP (1.3) [5,6]. For more details of the relation between the GAVE (1.1) and the LCP (1.3), please see [7,8,9].
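The substitution above can be checked numerically. The following sketch (plain numpy, not the paper's code) verifies that z=|x|−x and w=|x|+x automatically satisfy the sign and complementarity constraints of the LCP (1.3), and that the remaining equation w=Mz+q is exactly the GAVE with A=M+I and B=M−I:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
x = rng.standard_normal(n)          # any trial vector

z = np.abs(x) - x                   # z = |x| - x >= 0 by construction
w = np.abs(x) + x                   # w = |x| + x >= 0 by construction
assert np.all(z >= 0) and np.all(w >= 0)
assert abs(z @ w) < 1e-12           # complementarity z^T w = 0 is automatic

# w = M z + q  <=>  (M + I) x - (M - I)|x| = q
q = w - M @ z                       # choose q so that the LCP equation holds
A = M + np.eye(n)
B = M - np.eye(n)
residual = A @ x - B @ np.abs(x) - q
assert np.linalg.norm(residual) < 1e-12
```

Only the equation w=Mz+q carries information; the constraints z≥0, w≥0 and zTw=0 hold for every x by construction.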
In recent decades, considerable attention has been paid to the GAVE (1.1) and the AVE (1.2). On the one hand, sufficient conditions have been studied that guarantee the existence and uniqueness of the solution of the GAVE (1.1) [10,11,12,13,14]. Rohn showed that the GAVE (1.1) is uniquely solvable for any b∈Rn if σmin(A)>σmax(|B|) [10]. Wu and Li extended the results of [10] and showed that the GAVE (1.1) is uniquely solvable for any b∈Rn if σmin(A)>σmax(B) [11,12]. In [13], Rohn et al. showed that the GAVE (1.1) is uniquely solvable for any b∈Rn if ρ(|A−1B|)<1. Here and in the sequel, σmin(⋅), σmax(⋅) and ρ(⋅) denote the minimal singular value, the maximal singular value and the spectral radius of the corresponding matrix, respectively. On the other hand, many efficient iteration methods have been studied for solving the GAVE (1.1), including the concave minimization algorithm [15,16], the sign accord algorithm [17], the optimization algorithm [18], the hybrid algorithm [19], the preconditioned AOR iterative method [20], the Picard-HSS iteration method [21], Newton-type methods [22,23,24] and so on.
Due to the nonlinear term B|x|, the GAVE (1.1) can be regarded as the system of nonlinear equations
F(x) := Ax − B|x| − b = 0. (1.5)
As a result, the well-known Newton iteration method
xk+1 = xk − [F′(xk)]−1F(xk), k = 0,1,2,…, (1.6)
can be used provided that the Jacobian matrix F′(x) of F(x) exists and is invertible. However, the Newton iteration method (1.6) cannot be applied directly to the GAVE (1.1) since F(x)=Ax−B|x|−b is non-differentiable. For the special case B=I, by treating F(x)=Ax−|x|−b as a piecewise linear vector function, Mangasarian in [22] used the generalized Jacobian ∂|x| of |x|, based on a subgradient of its components, and presented the following generalized Newton (GN) iteration method
(A − D(xk))xk+1 = b, k = 0,1,2,…, (1.7)
to get an approximate solution of the AVE (1.2), where D(xk)=∂|xk|=diag(sign(xk)) and sign(x) stands for a vector with components equal to 1, 0, or −1 depending on whether the corresponding component of x is positive, zero or negative. Theoretical analysis showed that the GN iteration method (1.7) is globally linearly convergent under certain conditions [22]. Hu et al. extended the GN iteration scheme (1.7) to the GAVE (1.1) and proposed a weaker convergence condition [25]. For a general matrix B, the specific GN iteration scheme is
(A − BD(xk))xk+1 = b, k = 0,1,2,…. (1.8)
Recently, convergence results for the GN iteration schemes (1.7) and (1.8) have been further discussed in [26,27,28]. From the GN iteration scheme (1.7) or (1.8), we can see that the coefficient matrix A−D(xk) or A−BD(xk) changes at each iteration step. For large problems this is expensive, especially when the coefficient matrix is ill-conditioned. In addition, if the generalized Jacobian matrix is singular, then the GN iteration method fails. To remedy these drawbacks, many effective improvements have been presented, such as a stable and locally quadratically convergent iteration scheme [28], the generalized Traub's method [29], the modified GN iteration method [24,30], the inexact semi-smooth Newton iteration method [31], a new two-step iterative method [32] and so on. All these improvements greatly accelerate the convergence of the GN iteration method. However, when the generalized Jacobian matrix becomes singular, these newly developed iteration methods fail, too.
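A one-step sketch of the GN scheme (1.8) may clarify the notation. The code below (an illustrative numpy sketch, not the paper's implementation) forms D(xk)=diag(sign(xk)) and performs one GN step; since D(x)x=|x|, any solution of the GAVE is a fixed point of the iteration:

```python
import numpy as np

def gn_step(A, B, b, xk):
    """One generalized Newton step (1.8): solve (A - B D(x^k)) x^{k+1} = b."""
    D = np.diag(np.sign(xk))        # generalized Jacobian of |x| at x^k
    return np.linalg.solve(A - B @ D, b)

# A solution of Ax - B|x| = b is a fixed point, because D(x)x = |x| implies
# (A - B D(x*)) x* = A x* - B|x*| = b.
rng = np.random.default_rng(1)
n = 5
A = 10 * np.eye(n) + rng.standard_normal((n, n))   # strongly dominant, nonsingular
B = rng.standard_normal((n, n))
x_star = rng.standard_normal(n)
b = A @ x_star - B @ np.abs(x_star)                # manufacture a solution
assert np.linalg.norm(gn_step(A, B, b, x_star) - x_star) < 1e-8
```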
In this paper, by introducing a relaxation iteration parameter, we propose a relaxed generalized Newton (RGN) iteration method for the GAVE (1.1). The RGN iteration method generalizes both the GN iteration method [22] and the recently studied Picard iteration method [13]. The advantage of the new RGN iteration method is twofold: a suitable relaxation parameter not only avoids singularity of the generalized Jacobian matrix, but also improves the convergence rate. Theoretically, we prove that the RGN iteration method is well defined and globally linearly convergent under certain conditions. Moreover, a specific sufficient convergence condition is presented when the coefficient matrix A is symmetric positive definite. With two numerical examples, we show that the newly proposed RGN iteration method is much more efficient than some existing Newton-type iteration methods.
The rest of this paper is organized as follows. In Section 2, the RGN iteration method is introduced to solve the GAVE (1.1). Convergence analyses are studied in detail in Section 3. In Section 4, two numerical examples from the LCP (1.3) are presented to demonstrate the effectiveness of our new method. Finally, we end this paper with some conclusions and outlook in Section 5.
2.
The relaxed generalized Newton iteration method
In this section, a new relaxed generalized Newton iteration method is introduced to solve the GAVE (1.1).
Let θ≥0 be a nonnegative real parameter. Based on the Newton iteration scheme (1.6) and the ideas studied in [22,30], a new iteration scheme is introduced to solve the GAVE (1.1):
Substituting F(xk)=Axk−B|xk|−b from (1.5) and the generalized Jacobian ∂F(xk)=A−BD(xk) into (2.1), we obtain
Since D(x)=diag(sign(x)) is a diagonal matrix and satisfies D(xk)xk=|xk| [22], the new iteration scheme (2.1), or equivalently (2.2), simplifies to the following final form
(A − θBD(xk))xk+1 = (1 − θ)B|xk| + b, k = 0,1,2,…. (2.3)
Here, the iteration parameter θ plays the role of a relaxation parameter: it can avoid the singularity problem and adjust the condition number of the coefficient matrix A−θBD(xk), so as to improve the convergence rate of the GN iteration method (1.8). Hence, we call the new iteration method (2.3) the relaxed generalized Newton (RGN) iteration method. In particular, if θ=1, then the RGN iteration method (2.3) reduces to the GN iteration method (1.8). If θ=0, then the RGN iteration method (2.3) becomes
xk+1 = A−1(B|xk| + b), k = 0,1,2,…, (2.4)
which is known as the Picard iteration method [7,13].
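The two special cases can be checked directly. In the sketch below (illustrative numpy, not the paper's code), one RGN step solves (A−θBD(xk))xk+1=(1−θ)B|xk|+b; setting θ=1 and θ=0 reproduces the GN step (1.8) and the Picard step (2.4), respectively:

```python
import numpy as np

def rgn_step(A, B, b, xk, theta):
    """One RGN step: (A - theta*B*D(x^k)) x^{k+1} = (1 - theta)*B|x^k| + b.

    theta = 1 recovers the GN step; theta = 0 recovers the Picard step."""
    D = np.diag(np.sign(xk))
    return np.linalg.solve(A - theta * B @ D, (1 - theta) * B @ np.abs(xk) + b)

rng = np.random.default_rng(2)
n = 5
A = 8 * np.eye(n) + rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
b = rng.standard_normal(n)
xk = rng.standard_normal(n)

gn = np.linalg.solve(A - B @ np.diag(np.sign(xk)), b)   # GN step (1.8)
picard = np.linalg.solve(A, B @ np.abs(xk) + b)         # Picard step (2.4)
assert np.allclose(rgn_step(A, B, b, xk, 1.0), gn)
assert np.allclose(rgn_step(A, B, b, xk, 0.0), picard)
```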
The RGN iteration method (2.3) is well defined provided that the coefficient matrix A−θBD(xk) is nonsingular at each iteration step. The following theorem presents a sufficient condition. To this end, since the diagonal matrix D(x)=diag(sign(x)) may change at each iteration step, we first define the set of matrices
D = {D = diag(d1, d2, …, dn) ∈ Rn×n : di ∈ {−1, 0, 1}, i = 1, 2, …, n}. (2.5)
Theorem 2.1. Let A,B∈Rn×n, let θ≥0 be a nonnegative real parameter and let D∈Rn×n be any matrix in the set D (2.5). Let λmin(ATA) and λmax(BTB) be the smallest eigenvalue of ATA and the largest eigenvalue of BTB, respectively. If
λmin(ATA) > θ2λmax(BTB), (2.6)
then A−θBD is nonsingular and the RGN iteration method (2.3) is well defined.
Proof. We argue by contradiction. If A−θBD is singular, then there exists a nonzero vector x such that
(A − θBD)x = 0, i.e., Ax = θBDx.
In addition, since D∈Rn×n is a diagonal matrix with each diagonal element being 1, −1 or 0, DTD is a diagonal matrix, too, and each of its diagonal elements is either 1 or 0. Thus, it holds that
xTDTDx ≤ xTx.
Then we have the following contradiction:
λmin(ATA)xTx ≤ xTATAx = θ2xTDTBTBDx ≤ θ2λmax(BTB)xTDTDx ≤ θ2λmax(BTB)xTx < λmin(ATA)xTx.
Therefore, A−θBD is nonsingular and the RGN iteration method (2.3) is well defined provided that the condition (2.6) holds.
Remark 2.1. It should be noted that the condition given in Theorem 2.1 is a theoretical generalization of some recent works. In particular, if B=I and θ=1, then the condition (2.6) becomes λmin(ATA)>1, which means that all singular values of A exceed 1 and agrees with [22, Lemma 2.1]. If only θ=1, then the condition (2.6) is the one given in [25, Theorem 3.1]. In addition, if θ=0, then the condition (2.6) reduces to requiring that the matrix A be nonsingular, which shows that the Picard iteration method (2.4) is well defined.
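Theorem 2.1 can be illustrated numerically: once λmin(ATA)>θ2λmax(BTB) holds, A−θBD is nonsingular for every matrix D in the set D (2.5). A small brute-force check over all sign patterns (numpy sketch; the matrices are random and purely illustrative):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
n = 4
A = 4 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
B = 0.5 * rng.standard_normal((n, n))
theta = 0.8

lam_min = np.linalg.eigvalsh(A.T @ A).min()   # = sigma_min(A)^2
lam_max = np.linalg.eigvalsh(B.T @ B).max()   # = sigma_max(B)^2
assert lam_min > theta**2 * lam_max           # condition (2.6) holds

# Then A - theta*B*D is nonsingular for EVERY diagonal D with entries in {-1,0,1}.
for diag in product((-1, 0, 1), repeat=n):
    D = np.diag(diag)
    s = np.linalg.svd(A - theta * B @ D, compute_uv=False)
    assert s.min() > 1e-8                     # smallest singular value bounded away from 0
```

Enumerating all diagonal sign matrices is only feasible for tiny n, but it makes the "for any D in D" quantifier of the theorem concrete.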
To better implement the new RGN iteration method (2.3) in actual computations, we present an algorithmic version of the RGN iteration method (2.3) as follows. Here, ‖⋅‖2 denotes the Euclidean norm of either a vector or a matrix.
Algorithm 2.1. (The RGN iteration method)
1). Choose an arbitrary initial vector x0 and a nonnegative parameter θ. Given a tolerance ε>0, set k=0;
2). If ‖Axk−B|xk|−b‖2≤ε‖b‖2, stop;
3). Compute D(xk)=diag(sign(xk));
4). Solve the following linear system to obtain xk+1:
(A − θBD(xk))xk+1 = (1 − θ)B|xk| + b;
5). Set k=k+1. Go to Step 2.
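A compact implementation of Algorithm 2.1 may be sketched as follows (Python/numpy rather than the paper's MATLAB; the dense `solve` stands in for the sparse LU/Cholesky factorizations used in Section 4, and the manufactured test instance is illustrative only):

```python
import numpy as np

def rgn(A, B, b, theta, x0=None, eps=1e-8, kmax=5000):
    """Relaxed generalized Newton method (Algorithm 2.1) for Ax - B|x| = b.

    Each sweep solves (A - theta*B*D(x^k)) x^{k+1} = (1-theta)*B|x^k| + b,
    with D(x^k) = diag(sign(x^k)), and stops once the relative residual
    drops below eps (Step 2) or kmax iterations are reached."""
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    for k in range(kmax):
        res = A @ x - B @ np.abs(x) - b
        if np.linalg.norm(res) <= eps * np.linalg.norm(b):
            return x, k
        D = np.diag(np.sign(x))
        x = np.linalg.solve(A - theta * B @ D, (1 - theta) * B @ np.abs(x) + b)
    return x, kmax

# Usage on a small, well-conditioned instance with a manufactured solution.
rng = np.random.default_rng(4)
n = 6
A = 10 * np.eye(n) + rng.standard_normal((n, n))
B = 0.2 * rng.standard_normal((n, n))
x_star = rng.standard_normal(n)
b = A @ x_star - B @ np.abs(x_star)
x, k = rgn(A, B, b, theta=1.0)       # theta = 1: the GN special case
assert np.linalg.norm(x - x_star) < 1e-6
```

Here σmin(A) clearly exceeds σmax(B), so the GAVE has a unique solution and the iteration contracts; in the paper's large sparse experiments the inner solve would of course use a sparse factorization instead.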
3.
Convergence analysis
In this section, we establish the convergence theory of the RGN iteration method (2.3) for solving the GAVE (1.1). Specifically, two general convergence conditions of the RGN iteration method (2.3) are presented first. Then, a sufficient convergence condition is proposed when the system matrix A is symmetric positive definite. In addition, as special cases of the new RGN iteration method (2.3), the convergence conditions of the GN iteration method (1.8) and the Picard iteration method (2.4) are obtained immediately.
3.1. General sufficient convergence conditions
In this subsection, we first study some sufficient convergence conditions, assuming only that the RGN iteration method (2.3) is well defined.
Theorem 3.1. Let A,B∈Rn×n and let θ≥0 be a nonnegative real parameter satisfying the condition (2.6). Let D∈Rn×n be any matrix in the set D (2.5). If
(1 + θ)‖(A − θBD)−1‖2‖B‖2 < 1, (3.1)
then the RGN iteration method (2.3) converges linearly from any starting point to a solution x∗ of the GAVE (1.1).
Proof. Let x∗ be a solution of the GAVE (1.1); then it satisfies
Ax∗ − B|x∗| = b. (3.2)
Subtracting (3.2) from (2.3), we obtain
(A − θBD(xk))(xk+1 − x∗) = B(|xk| − |x∗|) − θBD(xk)(xk − x∗).
Hence,
xk+1 − x∗ = (A − θBD(xk))−1[B(|xk| − |x∗|) − θBD(xk)(xk − x∗)]. (3.3)
By assumptions, we have
Taking the Euclidean norm on both sides of (3.3) and noting that ‖D(xk)‖2 ≤ 1, we obtain
‖xk+1 − x∗‖2 ≤ (1 + θ)‖(A − θBD(xk))−1‖2‖B‖2 ‖xk − x∗‖2, (3.4)
where the inequality ‖ |xk|−|x∗| ‖2 ≤ ‖xk−x∗‖2 is used. From (3.4), we can see that the RGN iteration method (2.3) converges linearly from any starting point to a solution x∗ of the GAVE (1.1) provided that the condition (3.1) is satisfied.
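The norm inequality used above is the reverse triangle inequality applied componentwise: ||a|−|c|| ≤ |a−c| for scalars, hence ‖|x|−|y|‖2 ≤ ‖x−y‖2 for vectors. A quick randomized check (illustrative numpy sketch):

```python
import numpy as np

# || |x| - |y| ||_2 <= || x - y ||_2 for all x, y: componentwise reverse
# triangle inequality, then monotonicity of the Euclidean norm.
rng = np.random.default_rng(5)
for _ in range(1000):
    x = rng.standard_normal(8)
    y = rng.standard_normal(8)
    assert np.linalg.norm(np.abs(x) - np.abs(y)) <= np.linalg.norm(x - y) + 1e-12
```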
Theorem 3.2. Let A, B, θ and D be as in Theorem 3.1, and further assume that A is nonsingular. If
‖A−1‖2‖B‖2 < 1/(1 + 2θ), (3.5)
then the RGN iteration method (2.3) converges linearly from any starting point to a solution x∗ of the GAVE (1.1).
Proof. According to Theorem 3.1, we only need to verify the condition (3.1). Under the condition (3.5), by the Banach perturbation lemma (see [33, Lemma 2.3.3]), we have
‖(A − θBD)−1‖2 ≤ ‖A−1‖2/(1 − θ‖A−1‖2‖B‖2), and hence (1 + θ)‖(A − θBD)−1‖2‖B‖2 ≤ (1 + θ)‖A−1‖2‖B‖2/(1 − θ‖A−1‖2‖B‖2) < 1.
Therefore, the RGN iteration method (2.3) converges linearly from any starting point to a solution x∗ of the GAVE (1.1) if the condition (3.5) is satisfied.
As mentioned in Section 2, the well-known GN iteration method (1.8) and the Picard iteration method (2.4) are special cases of the new RGN iteration method (2.3) when the relaxation parameter θ is 1 and 0, respectively. By simply letting θ=1 and 0, we can obtain the following two corollaries, which describe the convergence conditions of the GN iteration method (1.8) and the Picard iteration method (2.4), respectively, for solving the GAVE (1.1).
Corollary 3.1. Let A,B∈Rn×n and let D∈Rn×n be any matrix in the set D (2.5). Assume that A−BD is nonsingular. If
2‖(A − BD)−1‖2‖B‖2 < 1,
or if A is nonsingular and
‖A−1‖2‖B‖2 < 1/3,
then the GN iteration method (1.8) converges linearly from any starting point to a solution x∗ of the GAVE (1.1).
Corollary 3.2. Let A,B∈Rn×n and assume that A is nonsingular. If
‖A−1‖2‖B‖2 < 1,
then the Picard iteration method (2.4) converges linearly from any starting point to a solution x∗ of the GAVE (1.1).
3.2. The symmetric positive definite case
In this subsection, we turn to discuss the convergence conditions of the RGN iteration method (2.3) for solving the GAVE (1.1) when the system matrix A is symmetric positive definite.
Theorem 3.3. Let A∈Rn×n be a symmetric positive definite matrix, B∈Rn×n, let D∈Rn×n be any matrix in the set D (2.5), and let θ>0 satisfy (2.6). Denote by μmin the smallest eigenvalue of A and set τ=‖B‖2. If
μmin > (1 + 2θ)τ, (3.6)
then the RGN iteration method (2.3) converges linearly from any starting point to a solution x∗ of the GAVE (1.1).
Proof. Since the matrix A is symmetric positive definite, it is easy to check that
‖A−1‖2 = 1/μmin.
If μmin and τ further satisfy the condition (3.6), we have
‖A−1‖2‖B‖2 = τ/μmin < 1/(1 + 2θ).
Therefore, by Theorem 3.2, we obtain that the RGN iteration method (2.3) converges linearly from any starting point to a solution x∗ of the GAVE (1.1). This completes the proof.
In Theorem 3.3, by setting θ=1 and 0, we have the following two corollaries to guarantee the convergence of the GN iteration method (1.8) and the Picard iteration method (2.4) for solving the GAVE (1.1), respectively.
Corollary 3.3. Let A∈Rn×n be a symmetric positive definite matrix and μmin its smallest eigenvalue. Let B∈Rn×n with τ=‖B‖2, and let D∈Rn×n be any matrix in the set D (2.5). If
μmin > 3τ,
then the GN iteration method (1.8) converges linearly from any starting point to a solution x∗ of the GAVE (1.1).
Corollary 3.4. Let A∈Rn×n be a symmetric positive definite matrix and μmin its smallest eigenvalue. Let B∈Rn×n with τ=‖B‖2. If
μmin > τ,
then the Picard iteration method (2.4) converges linearly from any starting point to a solution x∗ of the GAVE (1.1).
4.
Numerical experiments
In this section, we present some numerical examples arising from two types of LCP (1.3) to show the effectiveness of the RGN iteration method (2.3), and to demonstrate the advantages of the new RGN iteration method over the well-known Lemke's method, the Picard iteration method (2.4), the GN iteration method (1.8) and the modified GN (MGN) iteration method [30] in terms of iteration counts (denoted by "IT") and elapsed CPU times (denoted by "CPU"). The first LCP comes from [4] and has been used as a standard test problem by many researchers. The second LCP arises from practical single-bottleneck traffic models [34], which are usually solved by the well-known Lemke's method. Note that Lemke's method is one of the most efficient direct methods for solving the LCP (1.3).
In our experiments, the initial guess vector is the zero vector. All runs are terminated once
or once the prescribed maximum iteration number kmax=5000 is exceeded. At each iteration step of the RGN iteration method, the Picard iteration method, the GN iteration method and the MGN iteration method, we need to solve a system of linear equations with coefficient matrix A−θBD, A, A−BD and A+I−BD, respectively. These linear systems are solved by the sparse LU factorization when the coefficient matrices are nonsymmetric and by the sparse Cholesky factorization when they are symmetric positive definite. To implement the RGN iteration method efficiently, the relaxation iteration parameter θ must be chosen in advance. The convergence rates of all parameter-dependent iteration methods depend heavily on the particular choice of the iteration parameter, and the analytic determination of the value of θ that yields the fastest convergence of the RGN iteration method appears to be quite a difficult problem. Here, the relaxation iteration parameter θ used in the new RGN iteration method is chosen as the experimentally optimal one θexp, which leads to the smallest iteration count. In the following tables, "-" means that the corresponding iteration method does not converge to an approximate solution within kmax iteration steps or even diverges. All computations are run in MATLAB (version R2014a) in double precision on a Windows 8 system with an Intel(R) Core(TM) i5-3337U CPU and 8 GB RAM. We use the MATLAB codes available at https://ww2.mathworks.cn/matlabcentral/fileexchange/41485 to test Lemke's method.
Example 4.1. ([4]) Consider the LCP (1.3), in which M∈Rn×n is given by M = M̂ + μI and q∈Rn is given by q = −Mz∗, where
is a block-tridiagonal matrix,
is a tridiagonal matrix, and
is the unique solution of the LCP (1.3). Here n=m2. From the discussion presented in Section 1, we can see that the LCP (1.3) can be equivalently expressed as the GAVE (1.1), where A=M+I and B=M−I. The exact solution of the GAVE (1.1) is
For the first example, we take two values of the parameter μ, namely μ=0 and μ=−1, in actual computations. For each value of μ, four increasing sizes, i.e., m=30,60,90,120, are considered, so the corresponding problem dimensions are n=900,3600,8100,14400, respectively. Note that for the case μ=0, both M and A are symmetric positive definite, while for the case μ=−1, the matrix M is symmetric indefinite and the matrix A is symmetric positive definite. In Tables 1 and 2, we list the numerical results of the different methods for μ=0 and μ=−1, respectively.
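Since the matrices of Example 4.1 are not reproduced above, the following sketch assumes the entries commonly used for this benchmark in the literature: M̂ = Tridiag(−I, S, −I) with S = tridiag(−1, 4, −1) ∈ Rm×m, z∗ = (1,2,1,2,…)T and q = −Mz∗; these specific entries are an assumption, not taken from the text. Under that assumption, x∗ = −z∗/2 solves the resulting GAVE with right-hand side q:

```python
import numpy as np

def build_example(m, mu=0.0):
    """Example 4.1 data (entries ASSUMED, as commonly used for this benchmark):
    Mhat = Tridiag(-I, S, -I), S = tridiag(-1, 4, -1) in R^{m x m}, n = m^2,
    z_star = (1, 2, 1, 2, ...)^T, q = -M z_star."""
    n = m * m
    S = 4 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    I_m, M = np.eye(m), np.zeros((n, n))
    for i in range(m):                          # assemble block tridiagonal Mhat
        M[i*m:(i+1)*m, i*m:(i+1)*m] = S
        if i > 0:
            M[i*m:(i+1)*m, (i-1)*m:i*m] = -I_m
            M[(i-1)*m:i*m, i*m:(i+1)*m] = -I_m
    M += mu * np.eye(n)
    z_star = np.tile([1.0, 2.0], n // 2)
    q = -M @ z_star
    # GAVE data from Section 1: A = M + I, B = M - I; with q = -M z_star the
    # LCP solution is (z, w) = (z_star, 0), i.e., x_star = -z_star / 2.
    return M + np.eye(n), M - np.eye(n), q, -z_star / 2

A, B, b, x_star = build_example(m=8)
assert np.linalg.norm(A @ x_star - B @ np.abs(x_star) - b) < 1e-10
```

For μ=0 the assembled A is symmetric positive definite, matching the discussion above.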
From Tables 1 and 2, we can see that the GN iteration method and the new RGN iteration method perform much better than the other three computational methods in terms of both iteration steps and elapsed CPU times. For the case μ=0, Lemke's method fails to converge to a satisfactory solution for large problems. Although the Picard iteration method and the MGN iteration method converge, the numerical results show that their convergence rates are very slow. We also notice that the iteration counts of the Picard iteration method and the MGN iteration method are almost the same, but the elapsed CPU times show that the MGN iteration method is much more expensive than the Picard iteration method. The reason is that the coefficient matrix of the MGN iteration method changes at each iteration step. The best choice of the relaxation iteration parameter in the RGN iteration method is θexp=1, which means that the GN iteration method is the best one. For the case μ=−1, Lemke's method computes the exact solution for all test problems, but the iteration steps and the elapsed CPU times show that this method is not competitive in actual computations. The Picard iteration method diverges, because the matrix M is indefinite and the convergence conditions in Corollary 3.2 and Corollary 3.4 cannot be satisfied. Among the GN iteration method, the MGN iteration method and the RGN iteration method, the numerical results show that the new RGN iteration method is the best one.
Example 4.2. ([4]) Consider the LCP (1.3), in which M = M̂ + μI∈Rn×n and q = −Mz∗. Different from Example 4.1, the matrix M̂ in the second example is nonsymmetric, i.e.,
is a block-tridiagonal matrix,
is a tridiagonal matrix, and
is the unique solution of the LCP (1.3).
Similar to Example 4.1, the second example can also be equivalently expressed as the GAVE (1.1), and it has the same exact solution as Example 4.1. We again take two values of μ, i.e., μ=0 and μ=−1, and for each value of μ we consider four increasing sizes, i.e., m=30,60,90,120, so the total dimensions are n=900,3600,8100,14400, respectively. Different from Example 4.1, both M and A are nonsymmetric positive definite for the case μ=0. For the case μ=−1, the matrix M is nonsymmetric indefinite and the matrix A is nonsymmetric positive definite.
Tables 3 and 4 list the corresponding numerical results of the different methods for μ=0 and μ=−1, respectively. These numerical results further confirm the observations from Tables 1 and 2, i.e., the GN iteration method and the new RGN iteration method are superior to the other three computational methods in terms of computing efficiency. For the case μ=0, Lemke's method converges very slowly for small problems and does not converge within the given maximum iteration number for large problems. The other four computational methods are convergent; however, the Picard iteration method and the MGN iteration method converge very slowly. The GN iteration method and the new RGN iteration method have the same computational results and converge very fast, which means that the GN iteration method is the best one. Most importantly, the iteration counts of both the GN iteration method and the newly proposed RGN iteration method remain constant as the problem size grows. For the case μ=−1, Lemke's method performs much better than in the case μ=0. However, the computational results show that Lemke's method is still not competitive in real applications. The Picard iteration method is still divergent, for the same reason as in Example 4.1. The GN iteration method, the MGN iteration method and the proposed RGN iteration method converge faster than Lemke's method. From these numerical results, we see again that the RGN iteration method performs best among the three Newton-based iteration methods.
Example 4.3. ([34]) This example comes from the single bottleneck model with both homogeneous and heterogeneous commuters. The dynamic equilibrium conditions of the single bottleneck model can be transformed into the LCP (1.3), in which the system matrix M and the vector q have the following specific structure
where the submatrices M1∈R(ΥG)×Υ, M2∈R(ΥG)×(ΥG), M3∈RG×(ΥG), S∈RΥ×Υ are
the subvectors are
and
Here, τ∈T={0,1,2,⋯,Υ} and g∈G={1,2,⋯,G} are the indices of the time interval and the user group, respectively. When G=1 and G>1, the LCP (1.3) describes the homogeneous case and the heterogeneous case, respectively. s denotes the bottleneck capacity, with units given by number of vehicles per time interval, and Ng denotes the number of individuals in group g. αg, βg and γg are the unit costs (or values) of the travel time, of arriving early to work and of arriving late to work in group g, respectively. τ∗g is the preferred arrival time of group g, and 1 stands for a vector of all ones.
For this example, the total dimension is n=2ΥG+Υ+G. It is proved in [34] that the system matrix M is a copositive matrix and that the LCP (1.3) has a unique solution. In [34], the authors used Lemke's method to solve such problems and to simulate the single bottleneck model for both the homogeneous case (G=1) and the heterogeneous case (G=3). In addition, they took the total demand N=25, the bottleneck capacity s=3 per time unit and the preferred arrival time τ∗=7, with a time duration of 10 time units. For the homogeneous case, i.e., G=1, the unit costs are taken as α=2, β=1 and γ=4 per time unit. For the heterogeneous case, i.e., G=3, the unit-cost ratios are taken as α:β:γ = 2:1:4 and τ∗g=6,7,8 for groups 1-3, respectively. For a more detailed selection of these parameters, please see [34]. Here, the corresponding LCP is further equivalently transformed into the GAVE (1.1), where A=M+I and B=M−I. Then the Picard iteration method, the GN iteration method, the MGN iteration method and the new RGN iteration method are applied to solve the GAVE.
Numerical results of the different computational methods are listed in Tables 5 and 6 for G=1 and G=3, respectively. From these two tables, we can see that Lemke's method successfully solves the single bottleneck model, but the elapsed CPU times indicate that it is very expensive. The GN iteration method fails on the test problems, because the coefficient matrix A−BD(xk) becomes singular during the GN iteration. The Picard iteration method and the MGN iteration method can only be applied to some small problems; for large problems, these two iteration methods fail to converge within the prescribed maximum iteration number. The new RGN iteration method successfully solves all the test problems at low cost. Therefore, the new RGN iteration method is a powerful computational method for solving the GAVE (1.1).
5.
Conclusion
In this paper, by introducing a relaxation iteration parameter, a new relaxed generalized Newton (RGN) iteration method has been proposed to solve the generalized absolute value equations. We have proved that the RGN iteration method is well defined and converges globally under certain conditions. Numerical examples, all arising from the well-known LCP, are used to illustrate the efficiency of the new computational method. Numerical results show that the RGN iteration method converges and has much better computing efficiency than some existing methods, provided that suitable relaxation iteration parameters are chosen.
As with most parameter-based iteration methods, the choice of the iteration parameter is an open and challenging problem. Moreover, the RGN iteration method is proved here to be only linearly convergent, whereas in some recent works the GN iteration method has been modified into a globally and quadratically convergent method under very strong conditions; how to improve the RGN iteration method in this direction needs further in-depth study. In addition, generalized absolute value equations with a general nonlinear term, which arise in nonlinear complementarity problems [35], implicit complementarity problems [36,37] and quasi-complementarity problems [38,39], are of great interest. Future work should focus on estimating the quasi-optimal value of the relaxation iteration parameter, developing a globally and quadratically convergent RGN iteration method, and extending the method to more applications.
Acknowledgments
The authors are very much indebted to An Wang for writing the MATLAB codes. This work is supported by the National Natural Science Foundation of China (Nos. 11771225, 61771265, 71771127), the Humanities and Social Science Foundation of the Ministry of Education of China (No. 18YJCZH274), the Science and Technology Project of Nantong City (No. JC2018142) and the ‘226’ Talent Scientific Research Project of Nantong City.
Conflict of interest
The authors declare there is no conflict of interest.