1.
Introduction
The Chebyshev-Halley method is a popular iterative method for solving the simple roots of the nonlinear equation f(x)=0. In fact, the Chebyshev-Halley method has been first provided by Hernández and Salanova [1]. Gutiérrez and Hernández [2] have provided a modification for the Chebyshev-Halley type iterative methods in Banach spaces. The base for constructing the family is the third-order method
where a∈R, L(xk)=f″(xk)f(xk)f′(xk)2.
In 2008, Osada [3] gave two modifications of family of Chebyshev-Halley methods for analytic functions. With the developments of the theory of iteration processes, Kim et al. [4] proposed several new families of Chebyshev-Halley type methods based on weight function. Ivanov [5] established convergence theorems of Chebyshev-Halley iteration family for multiple polynomial zeros. In this paper, we proposed a new Chebyshev-Halley type method for solving the matrix sign functions.
In addition, the applications obtained for the matrix sign functions are significant. For example, matrix sign function can be used as a tool for solving of the algebraic Riccati equation [6,7,8], Lyapunov matrix equations[9], generalized algebraic Bernoulli equations[10] and separation problem of matrix eigenvalues[11]. It is used as a valuable method to compute the matrix square root, the matrix pth roots and the polar decomposition[12]. Due to the applicability of the matrix sign function, stable iterative schemes with global convergence have become viable choices for computing this matrix. Basin of attractions can help us to obtain the iterative schemes with global convergence, for example, Soleymani et al. [13,14] imposed some high order iterative methods for solving matrix sign function with application in stochastic differential equations. Other high-order iterative methods with global convergent have been proposed for solving matrix sign function, see[15,16,17,18,19] and the references therein.
The sign of a matrix function generalizes the scalar sign, then the scalar sign function for any z∈C not lying on the imaginary axis is given by
Indeed, this function for the matrix case was introduced by Roberts[19] for solving the problem of model reduction and the algebraic Riccati equations.
We suppose that the matrix A∈Cn×n has no eigenvalues on the imaginary axis. If A=TJAT−1 is a Jordan canonical form arranged so that
where the eigenvalues of J1∈Cq×q and J2∈C(n−q)×(n−q) locate in the open left half-plane and the open right half-plane, respectively. Then, the matrix sign function is given by
Actually, for any positive integer p, the matrix p-sector function[20] can be defined by
where the matrix sign function S=A(A2)−1/2 is taken in the case p=2. In fact, S has various properties, as shown in[21].
The well known iteration for computation of S is Newton's method (NM)
We know that iteration (1.6) is by no means the only rational matrix method for computing S, as a lot of other iterations have been derived to get a higher order of convergence, and to improve the convergence acceleration. Hopefully, Laub and Kenney established the Padé family of iterations in [22]. For non-pure imaginary z∈C, we have the following characterization:
where ξ=1−z2, hence the task of approximating sign(z) leads to that of approximating
where ξ is less than 1 in magnitude. Let the (l,n)- Padé approximant to f(ξ) be
and l+n≥1. More precisely, Kenney and Laub set up the rational iterations of the form
where Pl,n/Ql,n denotes the (l,n)- Padé approximant to the function (1−ξ)−1/2. Obviously, for any suitable l and n, the matrix versions of the iterations are given by
where r=l+n+1. It has been proved that the iterations with l=n and l=n−1 are globally convergent, while those with l≥n+1 have local convergence. The convergence rate being l+n+1 in every case. Similarly, we can give the reciprocal Padé iterations as follows:
Table 1 gives the principal Padé iteration and its reciprocal for the order 4≤r≤10. PMr denotes the principal Padé iteration, while RPMr denotes its reciprocal Padé iteration.
In particular, we study the cases l=n−1 and l=n, which we call the principal Padé iterations[23]. They satisfy the equation
Note that gr(x)=pr(x)/qr(x), where qr(x) and pr(x) are the even and odd parts of (1+x)r, respectively. On the other hand, lots of known iterative methods are contained in the Padé family or its reciprocal. Newton-Schultz iteration (NSM) can be retrieved in the case of l=1 and n=0,
Choosing l=1 and n=1 yields the Halley's method (HM):
After a brief introduction, we are interested in constructing more efficient methods for solving the matrix sign function. We also focus on some intriguing properties for finding the matrix sign function, including higher order convergence, global convergence, stability and the efficiency. In Section 2, we propose a variant of Chebyshev-Halley family with a free parameter. The convergence order of new Chebyshev-Halley type family is eight. Fractal results show that some special cases of the new family have global convergence. Some special cases of the new family are not in the Padé family. This means that some new iterative methods are obtained for solving the matrix sign function. We theoretically prove that the new family is convergent and stable in Sections 3 and 4, respectively. In Section 5, we compare our method with the existing methods by numerical experiments. The proposed method is applied to a random matrix, Wilson matrix and continuous-time algebraic Riccati equation. Numerical results show the effectiveness of the proposed methods. Finally, Section 6 concludes the findings.
2.
A new Chebyshev-Halley type iterative method
Here, we give the nonlinear matrix equation below:
in which I is a unit matrix. Moreover, the sign S is a solution of (2.1).
In fact, before we obtain a new iterative method to solve the matrix equation X2−I=0, we should discuss two important problems about the new matrix iteration. The first is not in the Padé family or its reciprocal. Second, it must be globally convergent.
Now, we propose a new iterative method of Chebyshev-Halley type with a free parameter:
where a∈R, noting that
By inserting Eqs (2.2) and (2.3) into Eq (2.1), we attain iteration in the reciprocal form as follows:
We can achieve the method (2.4) for calculating the sign function. Note that, Xk(k≥0) are rational functions of A and, hence, like A, commute with S.
Theorem 1. Let f(x)=0 be a function around the simple root α. If the initial point x0 is sufficiently close to α, then the order of convergence for (2.2) is eight, for any value of parameter a, with the following error equation:
where cj=f(j)(α)j!f′(α), j≥2, and εk=xk−α.
Proof. The result is based on Taylor's series and symbolic computation in Mathematica [2], where this is skipped over.
Now, some different cases of the family (2.4) are given below.
If a=−1, we have (M1)
If a=−2, we have (M2)
If a=1, we have (M3)
If a=2, we have (M4)
If a=−1/2, we have (M5)
If a=1/2, we have (M6)
If a=0, we have (M7)
If a=3/4, we have (M8)
One might want to attain an efficient scheme for selecting values of the free parameter a to compute the matrix sign function. We remake that the proposed methods should satisfy two cases, that is, to possess global convergence and to not be in the general Padé family of iterations (1.10).
Now, we can observe the convergence behavior of the members of the family (2.4) by drawing the attraction basins (for more information see[15]). For such a case, the associated attraction basins in terms of some values of parameter a to solve the scalar equation x2−1=0 are presented. We already know that the attraction basins for the Newton-Schultz iteration (1.14) has local convergence, so we neglect to draw it.
Here, we take the domain Γ=[−2,2]×[−2,2]∈C. Each starting point z0∈Γ is allocated a color by the simple zero where the new scheme (2.4) converges. The point is painted in black when the method diverges. The stopping criterion is |f(xk)|≤10−3 in our programs. In addition, we set the maximum number of iterations to 30. Note that the roots are plotted in two white points.
First of all, in Figure 1, the attraction basins of NM and HM are shown. We also provide different kinds of attraction basins for a=±1, a=±1/2, a=±2, a=0 and a=3/4 in Figures 2–5, respectively. We can see that methods M1, M2, M4, M5 and M7 have local convergence. Methods M3, M6 and M8 corresponding to a=1, a=1/2, a=3/4 perform the global convergence. But, M3 and M6 are the members from the Padé family, which are not new iterative schemes. M8 does not belong to the member of the Padé family. So, method M8 is a new iterative scheme with global convergence. Indeed, our main aim in constructing iterative schemes for the matrix sign is to reach the convergence order as fast as possible, and also to minimize computational cost. Taking into account the difficulty of finding a general family of matrix sign iterations, we should consider other essential properties of iterations for calculating the sign of a matrix. The following part is the convergence analysis.
3.
Convergence analysis
Here, we will show a precise convergence analysis of the method (2.4) under some assumptions. The following theorem is done to provide insight into the analysis.
Theorem 2. Let A∈Cn×n have no pure imaginary eigenvalues. Then, the proposed iterates {Xk}k=∞k=0 of (2.4) converge to the matrix sign S, choosing X0 = A.
Proof. Let A have a Jordan canonical form arranged as
where T is a nonsingular matrix and C, N are the square Jordan blocks corresponding to eigenvalues lying in C− and C+, respectively. Denote by λ1, ..., λq and λq+1, ..., λn values lying on the main diagonals of blocks C and N, respectively.
Of course, recall that
Thus,
Furthermore, we give the definition of Dk=T−1XkT, and it follows from (2.4) that
Notice that if D0 is a diagonal matrix, then using inductive proof, all successive Dk are diagonal matrices as well.
From the Eq (3.4), we can see that {Dk} converges to sign(Λ). By re-arranging (3.4) as n uncoupled scalar iterations to solve x2−1=0, we can derive the following equation
where dik=(Dk)i,i and 1≤i≤n. Similarly, for all 1≤i≤n, application of (3.4) and (3.5) lead to the convergence of {dik} to sign(λi).
Since the eigenvalues of A are not pure imaginary values and from (3.5), we have sign(λi)=si=±1. Therefore, it follows that
Since |di0|=|λi|>0 and |di0−1di0+1|<1, thus we have limk→∞|dik+1−1dik+1+1|=0, and limk→∞|dik|=1=|sign(λi)|. This shows that {dik} is convergent. Then, we can obtain that limk→∞Dk=sign(Λ). Recalling Dk=T−1XkT, we have
Finally, this finishes the proof.
Clearly, Xk are rational functions of A and therefore, similar to A, commute with S. Note that we know that S−1=S, S2=I, S2j+1=S, and S2j=I, j≥1. Hence, it is easy to show that the method (2.4) reads the following error inequality
where Bk=(1−2a)2I+(−11+4a+52a2)X2k+(−14+280a−56a2)X4k+(322−56a−56a2)X6k +(205−212a+52a2)X8k+(9−12a+4a2)X10k. The inequality (3.8) reveals the eighth order of convergence.
4.
Stability
This section begins with analysis of the stability of (2.4). We consider to prove the stability of (2.4) for finding S in a neighborhood of the solution. Take into account that how a small perturbation at the kth iterate is amplified or damped along the iterates.
Theorem 3. Let A∈Cn×n have no pure imaginary eigenvalues. The sequence {Xk}k=∞k=0 generated by (2.4) is asymptotically stable.
Proof. If X0 is a function of A, then the iterations of (2.4) are all functions of A and, thus, commute with A. Let △Xk be the numerical perturbation introduced at the kth iterate of (2.4). Then, we have
Here, we formally use approximations (△Xk)i≈0, since (△Xk)i≈0, i≥2, is small. For small △Xk, we can neglect the value. In this case, we get
By using the identity for any nonsingular matrix B and C [2]
the Eq (4.2) can be written as
Further, we apply △Xk+1=˜Xk+1−Xk+1=˜Xk+1−S, and it is easy to show that
We can know that the perturbation at the (k+1)st iteration is bounded; that is, we have
To summarize, the sequence {Xk}k=∞k=0 generated by (2.4) is stable. The proof is ended.
5.
Numerical experiments
This section describes some numerical experiments that demonstrate the effectiveness of the proposed method. On the other hand, we now only use iterative methods with global convergence. In the following tests, the compared schemes are NM, HM, PM4, M3, M6 and M8.
For a fairer comparison, all computations are performed on same laptop equipped with Core i5, 7th generation CPU. The stopping termination ‖X2k−I‖2≤10−4 is used in Tables 2 and 3. We have examined each method of 9 randomly generated matrices. The numerical results for different methods on random matrices of sizes 5×5, 10×10, 20×20, 50×50,100×100, 150×150,200×200,250×250 and 300×300 are given in Tables 2 and 3. Finally, we have recorded their average in the last line.
We study the behavior of different methods for finding the well-known Wilson matrix:
where we use the stopping termination ‖X2k−I‖2≤10−5 in Figure 6. We can see that the results show a stable behavior of the proposed iterative method for finding S.
Now, we investigate the problem of solving the algebraic Riccati equation (ARE) with matrix sign functions. Consider the continuous-time and discrete-time algebraic Riccati equation of the forms
where the quantities A∈Rn×n, Q=QT∈Rn×n and B∈Rn×n are positive semi-definite, X∈Rn×n is the unknown matrix, R=RT∈Cn×n is positive definite. The solutions of ARE are closely related to Symplectic matrices and Hamiltonian matrices[8].
First of all, we introduce the continuous-time algebraic Riccati equation. The stabilizing solution of the ARE (5.2) is a real matrix X for which all eigenvalues of A−BR−1BTX have negative real part and, therefore,
Moreover,
is Hamiltonian matrices. Thus, we have
for a suitable matrix K.
Now, we have X in the following form
and, thus,
which implies
Once the sign of H is calculated, we can solve the overdetermined system (5.9) by using the standard algorithm to get the required solution. Note that the matrix H can not have eigenvalues on the imaginary axis.
For the algebraic Riccati Eq (5.3), we have the Symplectic matrices M as follows:
The details are not explained here, see more information in [8].
In this test, we use the stopping termination ‖X2k−I‖∞≤10−8. We apply the proposed scheme M8 to solve the ARE with the following matrices
We get
By using method M8 (leaving 4 decimal places), it can be seen that
The resulting matrix, which is the solution of (5.3), would be
From the approximate solution (5.15), we know that method M8 is valid for solving the algebraic Riccati equation. In Table 4, the speed of solving the Riccati equation is further improved by using the M8 method. In Tables 2 and 3, essentially, in terms of the number of iterations and the computational CPU time, they imply that M8 has the best performance in general. We can demonstrate that the proposed method affirms the theoretical parts from the results. From numerical experiments, the proposed method presents consistent convergence behavior.
6.
Conclusions
In this paper, we propose a new family of Chebyshev-Halley type iterative method (2.4) with eighth-order. We theoretically prove that the new method (2.4) is convergent and asymptotically stable. Method M8 is a special case of method (2.4) with parameter a=34, which does not belong to the member of the Padé family. So, method M8 is a new iterative scheme for solving the matrix sign function. Attraction basins in Figures 1–5 are performed to show the convergence behaviors of different methods. From Figure 5, we know that method M8 is globally convergent. Method M8 is applied to a random matrix, Wilson matrix and continuous-time algebraic Riccati equation. Numerical results show that method M8 costs less computing time and requires less number of iterations. This means that method M8 has good convergence behavior.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (No. 61976027), the Natural Science Foundation of Liaoning Province (Nos. 2022-MS-371, 2019-ZD-0502), Educational Commission Foundation of Liaoning Province of China (Nos. LJKMZ20221492, LJKMZ20221498) and the Key Project of Bohai University (No. 0522xn078).
Conflict of interest
The authors declare no conflict of interest.