Research article

Two accelerated gradient-based iteration methods for solving the Sylvester matrix equation AX + XB = C

  • Received: 25 September 2024; Revised: 17 November 2024; Accepted: 29 November 2024; Published: 12 December 2024
  • MSC : 15A24, 65F30

  • In this paper, combining the preconditioning technique and the momentum term with the gradient-based iteration algorithm, two accelerated iteration algorithms are presented for solving the Sylvester matrix equation AX+XB=C. Sufficient conditions to guarantee the convergence properties of the proposed algorithms are analyzed in detail. By varying the parameters of these algorithms in each iteration, the corresponding adaptive iteration algorithms are also provided, and the adaptive parameters can be obtained explicitly by the minimum residual technique. Several numerical examples are implemented to illustrate the effectiveness of the proposed algorithms.

    Citation: Huiling Wang, Nian-Ci Wu, Yufeng Nie. Two accelerated gradient-based iteration methods for solving the Sylvester matrix equation AX + XB = C[J]. AIMS Mathematics, 2024, 9(12): 34734-34752. doi: 10.3934/math.20241654




    In this paper, we consider the iterative solution of the following Sylvester matrix equation:

    $$AX + XB = C, \qquad (1.1)$$

    where $A \in \mathbb{R}^{m\times m}$, $B \in \mathbb{R}^{n\times n}$, $C \in \mathbb{R}^{m\times n}$ are constant matrices and $X \in \mathbb{R}^{m\times n}$ is the unknown matrix to be obtained.

    Due to the extensive applications of Eq (1.1) in control theory and stability analysis [1,10,14], it has garnered considerable attention, and many algorithms have been proposed over the past few decades. For example, the gradient-based iteration (GI) algorithm described in [6,8,9] has proven to be an effective method for solving Eq (1.1). By incorporating a tunable parameter into the GI algorithm, a relaxed gradient-based iteration (RGI) algorithm [18] was introduced, which demonstrates improved performance over the GI algorithm when the relaxation parameter is appropriately chosen. To enhance the RGI algorithm's efficiency, an accelerated gradient-based iteration (AGBI) algorithm was proposed in [25] by leveraging the latest information from the preceding half-step. In order to achieve a lower computational cost, a Jacobi gradient iteration (JGI) method was outlined in [13] based on the Jacobi splitting of A and B. Drawing inspiration from the AGBI and JGI algorithms, Tian et al. [21] further developed an accelerated JGI (AJGI) algorithm. Additionally, because of their wide applications, various other iteration algorithms [7,20,22] have been devised for solving Eq (1.1) and other related matrix equations [17,23,24,29].

    Preconditioning techniques aim to alter the spectral characteristics of matrices through linear transformations; they are often integrated with other iteration methods, leading to various new algorithms such as the preconditioned HSS method [2,16], the generalized preconditioned HSS method [28], and the preconditioned MHSS iteration method [4]. The heavy-ball momentum method is widely applied to accelerate the convergence rate of gradient methods [5,19]. In this paper, inspired by the references [2,5,19], we combine the preconditioning technique and the momentum term with the gradient-based iteration algorithm. The specific contributions can be summarized as follows:

    (a) Novel Methodology. We have developed the preconditioned gradient-based iteration (PGI) and gradient-based momentum iteration (GMI) algorithms for solving Eq (1.1), which are more efficient than existing methods in terms of computational complexity and accuracy.

    (b) Theoretical Insights. Our work provides new theoretical insights into gradient-based iteration algorithms. The convergence of the PGI and GMI algorithms is rigorously proved.

    (c) Adaptive Parameter Selection. We have developed a new parameter selection strategy that minimizes the current residual norm, leading to improved performance of the proposed algorithms. This strategy is practical and can be easily implemented in various numerical algorithms, enhancing their efficiency and accuracy.

    (d) Empirical Results. Through extensive numerical experiments, we have shown that our methods outperform current state-of-the-art techniques in solving Eq (1.1).

    The remainder of this paper is organized as follows: In Section 2, we first review the GI algorithm and then present the PGI and GMI algorithms, whose convergence properties are analyzed in detail. In Section 3, we construct the adaptive PGI and GMI algorithms, in which the parameters are updated by utilizing the iterative information. In Section 4, several numerical examples are employed to show the robustness and efficiency of the proposed algorithms. Finally, some conclusions are drawn in the last section.

    In this section, we first review the GI algorithm. Subsequently, two accelerated GI algorithms are presented, and detailed analyses are conducted on their convergence properties. In the following, several lemmas are given, which will be used in the subsequent proofs.

    Lemma 2.1. [12] Let $A \in \mathbb{R}^{m\times n}$, $B \in \mathbb{R}^{p\times q}$, and

    $$\mathcal{R}(A,B) := \{ M \in \mathbb{R}^{n\times p} \mid \exists\, Z \in \mathbb{R}^{m\times q} \ \text{such that}\ M = A^T Z B^T \}.$$

    For any matrix $M \in \mathcal{R}(A,B)$, it holds that

    $$\|AMB\|_F^2 \ge \sigma_{\min}^2(A)\,\sigma_{\min}^2(B)\,\|M\|_F^2,$$

    where $\sigma_{\min}(A)$ and $\sigma_{\min}(B)$ are the smallest singular values of the matrices $A$ and $B$, respectively.

    Lemma 2.2. [11,15] Both roots of the real quadratic equation $x^2 - bx + c = 0$ are less than one in modulus if and only if $|c| < 1$ and $|b| < 1 + c$.

    By utilizing the hierarchical identification principle, Eq (1.1) can be reformulated into two subsystems as follows:

    $$AX = C - XB, \qquad XB = C - AX. \qquad (2.1)$$

    The GI algorithm for solving (2.1) is summarized in Algorithm 1 below.

    It is shown that the GI algorithm [8] converges when

    $$0 < \mu < \frac{2}{\lambda_{\max}(AA^T) + \lambda_{\max}(B^TB)},$$

    where $\lambda_{\max}(AA^T)$ and $\lambda_{\max}(B^TB)$ are the largest eigenvalues of $AA^T$ and $B^TB$, respectively.

    By introducing two preconditioners, $P$ and $Q$, into Algorithm 1, a preconditioned gradient-based iteration (i.e., PGI) algorithm is constructed and summarized in Algorithm 2.

    Algorithm 1 The GI algorithm [8].
    Require: Given an initial approximate matrix $X^{(0)}$ and the parameter $\mu$
    Ensure: $X^{(k)}$
    1: For $k = 1, 2, \ldots$, until convergence, do
    2:  $X_1^{(k)} = X^{(k-1)} + \mu A^T\big[C - AX^{(k-1)} - X^{(k-1)}B\big]$,
    3:  $X_2^{(k)} = X^{(k-1)} + \mu\big[C - AX^{(k-1)} - X^{(k-1)}B\big]B^T$,
    4:  $X^{(k)} = \big[X_1^{(k)} + X_2^{(k)}\big]/2$.
    5: End
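
    For reference, a minimal MATLAB sketch of Algorithm 1 is given below; the function name gi_sylvester, the tolerance tol, the iteration cap maxit, and the zero initial guess are illustrative choices rather than part of the original algorithm.

    ```matlab
    % Minimal sketch of the GI algorithm (Algorithm 1); names and stopping rule are illustrative.
    function [X, k] = gi_sylvester(A, B, C, mu, tol, maxit)
        X = zeros(size(C));                          % initial approximation X^(0)
        nrm0 = norm(C - A*X - X*B, 'fro');           % norm of the initial residual
        for k = 1:maxit
            R  = C - A*X - X*B;                      % current residual
            X1 = X + mu * (A' * R);                  % half-step from the subsystem AX = C - XB
            X2 = X + mu * (R * B');                  % half-step from the subsystem XB = C - AX
            X  = (X1 + X2) / 2;                      % average the two half-steps
            if norm(C - A*X - X*B, 'fro') / nrm0 <= tol
                break;                               % stop on the relative residual norm
            end
        end
    end
    ```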

    Remark 1. If $P = I_m$ and $Q = I_n$ are adopted, the PGI method reduces to the original GI algorithm in [8], where $I_s$ denotes the identity matrix of size $s$.

    Remark 2. Two practical choices of the matrices P and Q are listed as follows:

    1) $P = \operatorname{diag}(A)$, $Q = \operatorname{diag}(B)$, where $\operatorname{diag}(A)$ and $\operatorname{diag}(B)$ are the diagonal matrices of $A$ and $B$, respectively.

    2) $P = \operatorname{tridiag}(A^TA)$, $Q = \operatorname{tridiag}(BB^T)$, where $\operatorname{tridiag}(A^TA)$ and $\operatorname{tridiag}(BB^T)$ are the tridiagonal matrices of $A^TA$ and $BB^T$, respectively.
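
    To illustrate choice 1), a minimal MATLAB sketch of a single PGI sweep (cf. Algorithm 2 below) is shown; the variable names and the solve-based application of the preconditioners are our own illustration.

    ```matlab
    % One PGI sweep (cf. Algorithm 2 below) with the diagonal preconditioners of choice 1);
    % variable names and the backslash/slash application of P^{-1}, Q^{-1} are illustrative.
    P  = diag(diag(A));                 % P = diag(A)
    Q  = diag(diag(B));                 % Q = diag(B)
    R  = C - A*X - X*B;                 % current residual
    X1 = X + mu * (P \ (A' * R));       % preconditioned half-step  P^{-1} A^T R
    X2 = X + mu * ((R * B') / Q);       % preconditioned half-step  R B^T Q^{-1}
    X  = (X1 + X2) / 2;                 % average the two half-steps
    ```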

    Theorem 2.1. Let $X^*$ be the solution of Eq (1.1). The iterative solution $X^{(k)}$ generated by Algorithm 2 converges to $X^*$ for any initial value if and only if the parameter $\mu$ satisfies the condition

    $$\Big\| 2I_{mn} - \mu\big(I_n \otimes P^{-1}A^TA + B^T \otimes P^{-1}A^T + Q^{-T}B \otimes A + Q^{-T}BB^T \otimes I_m\big) \Big\|_2 < 2, \qquad (2.2)$$

    where $\|\cdot\|_2$ denotes the 2-norm of a matrix.

    Algorithm 2 The PGI algorithm.
    Require: Given an initial matrix $X^{(0)}$, two preconditioners $P$ and $Q$, and the parameter $\mu$
    Ensure: $X^{(k)}$
    1: For $k = 1, 2, \ldots$, until convergence, do
    2:  $X_1^{(k)} = X^{(k-1)} + \mu P^{-1}A^T\big[C - AX^{(k-1)} - X^{(k-1)}B\big]$,
    3:  $X_2^{(k)} = X^{(k-1)} + \mu\big[C - AX^{(k-1)} - X^{(k-1)}B\big]B^TQ^{-1}$,
    4:  $X^{(k)} = \big[X_1^{(k)} + X_2^{(k)}\big]/2$.
    5: End


    Proof: For $k = 1, 2, \ldots$, define the $k$th error matrix $\widetilde{X}^{(k)} := X^{(k)} - X^*$, which satisfies the following recurrence:

    $$\widetilde{X}^{(k)} = \widetilde{X}^{(k-1)} - \frac{\mu}{2}P^{-1}A^TA\widetilde{X}^{(k-1)} - \frac{\mu}{2}P^{-1}A^T\widetilde{X}^{(k-1)}B - \frac{\mu}{2}A\widetilde{X}^{(k-1)}B^TQ^{-1} - \frac{\mu}{2}\widetilde{X}^{(k-1)}BB^TQ^{-1}. \qquad (2.3)$$

    By using the Kronecker product [26], the above equation can be reformulated as

    $$\operatorname{vec}(\widetilde{X}^{(k)}) = \operatorname{vec}(\widetilde{X}^{(k-1)}) - \frac{\mu}{2}(I_n \otimes P^{-1}A^TA)\operatorname{vec}(\widetilde{X}^{(k-1)}) - \frac{\mu}{2}(B^T \otimes P^{-1}A^T)\operatorname{vec}(\widetilde{X}^{(k-1)}) - \frac{\mu}{2}(Q^{-T}B \otimes A)\operatorname{vec}(\widetilde{X}^{(k-1)}) - \frac{\mu}{2}(Q^{-T}BB^T \otimes I_m)\operatorname{vec}(\widetilde{X}^{(k-1)}).$$

    Taking the 2-norm of $\operatorname{vec}(\widetilde{X}^{(k)})$, it follows that

    $$\|\operatorname{vec}(\widetilde{X}^{(k)})\|_2 \le \eta\,\|\operatorname{vec}(\widetilde{X}^{(k-1)})\|_2,$$

    where

    $$\eta = \frac{1}{2}\Big\| 2I_{mn} - \mu\big(I_n \otimes P^{-1}A^TA + B^T \otimes P^{-1}A^T + Q^{-T}B \otimes A + Q^{-T}BB^T \otimes I_m\big) \Big\|_2.$$

    Thus,

    $$\|\operatorname{vec}(\widetilde{X}^{(k)})\|_2 \le \eta\,\|\operatorname{vec}(\widetilde{X}^{(k-1)})\|_2 \le \cdots \le \eta^k\,\|\operatorname{vec}(\widetilde{X}^{(0)})\|_2.$$

    If $\mu$ satisfies (2.2), then $\eta < 1$, and hence $\widetilde{X}^{(k)} \to 0$ as $k \to \infty$. The proof is completed.
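
    For moderate $m$ and $n$, condition (2.2) can be verified numerically by forming the Kronecker-product matrix explicitly; the MATLAB snippet below is our own illustration and simply follows the vec identities used in the proof.

    ```matlab
    % Illustrative check of condition (2.2); forms an (mn)-by-(mn) matrix, so only for moderate sizes.
    m = size(A, 1);  n = size(B, 1);
    T = kron(eye(n), P \ (A'*A)) + kron(B', P \ A') ...
      + kron(Q' \ B, A) + kron(Q' \ (B*B'), eye(m));   % Kronecker form of the PGI iteration matrix
    ok = norm(2*eye(m*n) - mu*T, 2) < 2;               % true if mu satisfies (2.2)
    ```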

    In order to make full use of the information from the previous iteration step, a momentum term is added to the GI algorithm, and the second accelerated GI algorithm (i.e., GMI) is proposed and summarized in Algorithm 3.

    Remark 3. If β is chosen to be 0, then the GMI algorithm is just the GI algorithm.
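
    A minimal MATLAB sketch of the GMI iteration (see Algorithm 3 below) reads as follows; the function name gmi_sylvester, the stopping rule, and the zero initial guesses are illustrative choices.

    ```matlab
    % Minimal sketch of the GMI algorithm (see Algorithm 3 below); names and stopping rule are illustrative.
    function [X, k] = gmi_sylvester(A, B, C, mu, beta, tol, maxit)
        Xold = zeros(size(C));                       % X^(0)
        X    = zeros(size(C));                       % X^(1), here also taken as the zero matrix
        nrm0 = norm(C - A*X - X*B, 'fro');
        for k = 2:maxit
            R    = C - A*X - X*B;
            X1   = X + mu * (A' * R);
            X2   = X + mu * (R * B');
            Xnew = (X1 + X2)/2 + beta*(X - Xold);    % momentum term beta*(X^(k-1) - X^(k-2))
            Xold = X;  X = Xnew;
            if norm(C - A*X - X*B, 'fro') / nrm0 <= tol
                break;
            end
        end
    end
    ```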

    Algorithm 3 The GMI algorithm.
    Require: Given two initial matrices $X^{(0)}$ and $X^{(1)}$, and two parameters $\mu$ and $\beta$
    Ensure: $X^{(k)}$
    1: For $k = 2, 3, 4, \ldots$, until convergence, do
    2:  $X_1^{(k)} = X^{(k-1)} + \mu A^T\big[C - AX^{(k-1)} - X^{(k-1)}B\big]$,
    3:  $X_2^{(k)} = X^{(k-1)} + \mu\big[C - AX^{(k-1)} - X^{(k-1)}B\big]B^T$,
    4:  $X^{(k)} = \big[X_1^{(k)} + X_2^{(k)}\big]/2 + \beta\big(X^{(k-1)} - X^{(k-2)}\big)$.
    5: End

    Theorem 2.2. Assume that the matrices $A$ and $B$ are non-singular. If Eq (1.1) has a unique solution $X^*$, then the iterative solution $X^{(k)}$ obtained from Algorithm 3 converges to $X^*$ for any initial values if and only if the parameters $\mu$ and $\beta$ satisfy the following conditions:

    Case 1: When $3q_2 - q_1^2 > 0$,

    $$\begin{cases} \sqrt{\dfrac{q_1^2 - 2q_2}{4q_2}} \le \beta < a, \\[2mm] \dfrac{q_1 - \sqrt{c}}{q_2} < \mu < \dfrac{q_1 + \sqrt{c}}{q_2}, \end{cases} \qquad (2.4)$$

    or

    $$\begin{cases} 0 < \beta < b, \\[2mm] \dfrac{q_1 - \sqrt{c}}{q_2} < \mu \le \dfrac{q_1 - \sqrt{d}}{q_2} \quad \text{or} \quad \dfrac{q_1 + \sqrt{d}}{q_2} \le \mu < \dfrac{q_1 + \sqrt{c}}{q_2}, \end{cases} \qquad (2.5)$$

    where $q_1 = \sigma_{\min}^2(A) + \sigma_{\min}^2(B) - 2\|A\|_2\|B\|_2$, $q_2 = (\|A\|_2^2 + \|B\|_2^2)(\|A\|_2 + \|B\|_2)^2$, $a = \min\Big\{\dfrac{1}{2}, \sqrt{\dfrac{q_1^2 - q_2}{8q_2}}\Big\}$, $b = \min\Big\{\dfrac{1}{2}, \sqrt{\dfrac{q_1^2 - 2q_2}{4q_2}}\Big\}$, $c = q_1^2 - q_2(8\beta^2 + 1)$, and $d = q_1^2 - q_2(4\beta^2 + 2)$.

    Case 2: When $3q_2 - q_1^2 \le 0$,

    $$\begin{cases} 0 < \beta < a, \\[2mm] \dfrac{q_1 - \sqrt{c}}{q_2} < \mu \le \dfrac{q_1 - \sqrt{d}}{q_2} \quad \text{or} \quad \dfrac{q_1 + \sqrt{d}}{q_2} \le \mu < \dfrac{q_1 + \sqrt{c}}{q_2}. \end{cases} \qquad (2.6)$$

    Proof: Define the error matrices

    $$\widetilde{X}_1^{(k)} := X_1^{(k)} - X^*, \qquad \widetilde{X}_2^{(k)} := X_2^{(k)} - X^*, \qquad \widetilde{X}^{(k)} := X^{(k)} - X^*, \qquad k = 1, 2, \ldots.$$

    From Algorithm 3, it follows that

    $$\begin{cases} \widetilde{X}_1^{(k)} = \widetilde{X}^{(k-1)} - \mu A^T\big[A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big], \\[1mm] \widetilde{X}_2^{(k)} = \widetilde{X}^{(k-1)} - \mu\big[A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big]B^T. \end{cases} \qquad (2.7)$$

    Taking the F-norm on (2.7), it yields that

    $$\begin{aligned} \|\widetilde{X}_1^{(k)}\|_F^2 &= \|\widetilde{X}^{(k-1)}\|_F^2 - 2\mu\,\mathrm{tr}\big\{(A\widetilde{X}^{(k-1)})^T\big[A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big]\big\} + \mu^2\big\|A^T\big[A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big]\big\|_F^2 \\ &\le \|\widetilde{X}^{(k-1)}\|_F^2 - 2\mu\,\mathrm{tr}\big\{(A\widetilde{X}^{(k-1)})^T\big[A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big]\big\} + \mu^2\|A\|_2^2\,\big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F^2, \end{aligned} \qquad (2.8)$$

    and

    $$\|\widetilde{X}_2^{(k)}\|_F^2 \le \|\widetilde{X}^{(k-1)}\|_F^2 - 2\mu\,\mathrm{tr}\big\{\big[A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big](\widetilde{X}^{(k-1)}B)^T\big\} + \mu^2\|B\|_2^2\,\big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F^2. \qquad (2.9)$$

    By the triangle inequality and the property of the F-norm, we have

    $$\big\|A\widetilde{X}^{(k-1)}\big\|_F - \big\|\widetilde{X}^{(k-1)}B\big\|_F \le \big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F \le \big\|A\widetilde{X}^{(k-1)}\big\|_F + \big\|\widetilde{X}^{(k-1)}B\big\|_F \le (\|A\|_2 + \|B\|_2)\big\|\widetilde{X}^{(k-1)}\big\|_F,$$

    or

    $$\big\|\widetilde{X}^{(k-1)}B\big\|_F - \big\|A\widetilde{X}^{(k-1)}\big\|_F \le \big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F \le (\|A\|_2 + \|B\|_2)\big\|\widetilde{X}^{(k-1)}\big\|_F.$$

    Squaring both sides, we obtain

    $$\big\|A\widetilde{X}^{(k-1)}\big\|_F^2 + \big\|\widetilde{X}^{(k-1)}B\big\|_F^2 - 2\big\|A\widetilde{X}^{(k-1)}\big\|_F\big\|\widetilde{X}^{(k-1)}B\big\|_F \le \big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F^2 \le (\|A\|_2 + \|B\|_2)^2\big\|\widetilde{X}^{(k-1)}\big\|_F^2.$$

    According to Lemma 2.1, we have

    $$\big(\sigma_{\min}^2(A) + \sigma_{\min}^2(B) - 2\|A\|_2\|B\|_2\big)\big\|\widetilde{X}^{(k-1)}\big\|_F^2 \le \big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F^2 \le (\|A\|_2 + \|B\|_2)^2\big\|\widetilde{X}^{(k-1)}\big\|_F^2. \qquad (2.10)$$

    Combining (2.8)–(2.10), it yields that

    $$\begin{aligned} \|\widetilde{X}^{(k)}\|_F^2 &= \Big\|\frac{\widetilde{X}_1^{(k)} + \widetilde{X}_2^{(k)}}{2} + \beta\big(\widetilde{X}^{(k-1)} - \widetilde{X}^{(k-2)}\big)\Big\|_F^2 \le 2\Big\|\frac{\widetilde{X}_1^{(k)} + \widetilde{X}_2^{(k)}}{2}\Big\|_F^2 + 2\beta^2\big\|\widetilde{X}^{(k-1)} - \widetilde{X}^{(k-2)}\big\|_F^2 \\ &\le \|\widetilde{X}_1^{(k)}\|_F^2 + \|\widetilde{X}_2^{(k)}\|_F^2 + 2\beta^2\big\|\widetilde{X}^{(k-1)} - \widetilde{X}^{(k-2)}\big\|_F^2 \\ &\le 2\|\widetilde{X}^{(k-1)}\|_F^2 - 2\mu\big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F^2 + \mu^2(\|A\|_2^2 + \|B\|_2^2)\big\|A\widetilde{X}^{(k-1)} + \widetilde{X}^{(k-1)}B\big\|_F^2 + 2\beta^2\big\|\widetilde{X}^{(k-1)} - \widetilde{X}^{(k-2)}\big\|_F^2 \\ &\le 2\|\widetilde{X}^{(k-1)}\|_F^2 - 2\mu\big(\sigma_{\min}^2(A) + \sigma_{\min}^2(B) - 2\|A\|_2\|B\|_2\big)\|\widetilde{X}^{(k-1)}\|_F^2 + \mu^2(\|A\|_2^2 + \|B\|_2^2)(\|A\|_2 + \|B\|_2)^2\|\widetilde{X}^{(k-1)}\|_F^2 + 4\beta^2\big(\|\widetilde{X}^{(k-1)}\|_F^2 + \|\widetilde{X}^{(k-2)}\|_F^2\big) \\ &= \big[2 - 2\mu\big(\sigma_{\min}^2(A) + \sigma_{\min}^2(B) - 2\|A\|_2\|B\|_2\big) + \mu^2(\|A\|_2^2 + \|B\|_2^2)(\|A\|_2 + \|B\|_2)^2\big]\|\widetilde{X}^{(k-1)}\|_F^2 + 4\beta^2\|\widetilde{X}^{(k-1)}\|_F^2 + 4\beta^2\|\widetilde{X}^{(k-2)}\|_F^2. \end{aligned} \qquad (2.11)$$

    By (2.11), we have

    $$\begin{bmatrix} \|\widetilde{X}^{(k)}\|_F^2 \\ \|\widetilde{X}^{(k-1)}\|_F^2 \end{bmatrix} \le H \begin{bmatrix} \|\widetilde{X}^{(k-1)}\|_F^2 \\ \|\widetilde{X}^{(k-2)}\|_F^2 \end{bmatrix} \le \cdots \le H^{k-1} \begin{bmatrix} \|\widetilde{X}^{(1)}\|_F^2 \\ \|\widetilde{X}^{(0)}\|_F^2 \end{bmatrix},$$

    where

    $$H = \begin{bmatrix} q_2\mu^2 - 2q_1\mu + 2 + 4\beta^2 & 4\beta^2 \\ 1 & 0 \end{bmatrix}.$$

    Let $\lambda$ be an eigenvalue of the matrix $H$. We know that

    $$\lambda^2 - \lambda\big(q_2\mu^2 - 2q_1\mu + 2 + 4\beta^2\big) - 4\beta^2 = 0.$$

    It then follows from Lemma 2.2 that $|\lambda| < 1$ if and only if

    $$\begin{cases} 4\beta^2 < 1, \\ \big|q_2\mu^2 - 2q_1\mu + 2 + 4\beta^2\big| < 1 - 4\beta^2, \end{cases}$$

    which implies that

    $$\begin{cases} 0 < \beta < \dfrac{1}{2}, \\[1mm] -1 + 4\beta^2 < q_2\mu^2 - 2q_1\mu + 2 + 4\beta^2 < 1 - 4\beta^2. \end{cases} \qquad (2.12)$$

    In addition, $H \ge 0$ ($H \ge 0$ if $h_{ij} \ge 0$ holds for all $1 \le i \le 2$, $1 \le j \le 2$) if and only if

    $$q_2\mu^2 - 2q_1\mu + 2 + 4\beta^2 \ge 0. \qquad (2.13)$$

    Together with (2.12) and (2.13), we have

    $$\begin{cases} 0 < \beta < \dfrac{1}{2}, \\[1mm] 0 \le q_2\mu^2 - 2q_1\mu + 2 + 4\beta^2 < 1 - 4\beta^2. \end{cases} \qquad (2.14)$$

    In the following, we mainly solve the inequalities (2.14) to obtain the range of $\mu$ and $\beta$. The second inequality of (2.14) is equivalent to

    $$\begin{cases} q_2\mu^2 - 2q_1\mu + 8\beta^2 + 1 < 0, \\ q_2\mu^2 - 2q_1\mu + 4\beta^2 + 2 \ge 0. \end{cases} \qquad (2.15)$$

    Consider (2.15) as a system of quadratic inequalities in terms of $\mu$, and determine the range of $\mu$ that makes the inequalities hold. Let us first solve the first inequality of (2.15). When $\Delta_1 = 4q_1^2 - 4q_2(8\beta^2 + 1) > 0$, i.e., $0 < \beta < \sqrt{\frac{q_1^2 - q_2}{8q_2}}$, the quadratic equation $q_2\mu^2 - 2q_1\mu + 8\beta^2 + 1 = 0$ has the two roots $\frac{q_1 - \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2}$ and $\frac{q_1 + \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2}$. So the solution of the first inequality in (2.15) is

    $$\frac{q_1 - \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2} < \mu < \frac{q_1 + \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2}.$$

    When $\Delta_1 \le 0$, $q_2\mu^2 - 2q_1\mu + 8\beta^2 + 1$ is always greater than or equal to $0$, so the inequality $q_2\mu^2 - 2q_1\mu + 8\beta^2 + 1 < 0$ has no solution.

    The second inequality of (2.15) is solved by the same method. When $\Delta_2 = 4q_1^2 - 4q_2(4\beta^2 + 2) \le 0$, i.e., $\beta \ge \sqrt{\frac{q_1^2 - 2q_2}{4q_2}}$, $q_2\mu^2 - 2q_1\mu + 4\beta^2 + 2$ is always greater than or equal to $0$, so the solution of the second inequality of (2.15) is $\mu \in \mathbb{R}$; when $\Delta_2 > 0$, i.e., $0 < \beta < \sqrt{\frac{q_1^2 - 2q_2}{4q_2}}$, the solution of the second inequality of (2.15) is

    $$\mu \le \frac{q_1 - \sqrt{q_1^2 - q_2(4\beta^2+2)}}{q_2} \quad \text{or} \quad \mu \ge \frac{q_1 + \sqrt{q_1^2 - q_2(4\beta^2+2)}}{q_2}.$$

    In order to find the solution of (2.15), we need to consider the following cases:

    Case 1: If $3q_2 - q_1^2 > 0$, then

    $$\begin{cases} \sqrt{\dfrac{q_1^2 - 2q_2}{4q_2}} \le \beta < \sqrt{\dfrac{q_1^2 - q_2}{8q_2}}, \\[2mm] \dfrac{q_1 - \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2} < \mu < \dfrac{q_1 + \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2}, \end{cases} \qquad (2.16)$$

    or

    $$\begin{cases} 0 < \beta < \sqrt{\dfrac{q_1^2 - 2q_2}{4q_2}}, \\[2mm] \dfrac{q_1 - \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2} < \mu \le \dfrac{q_1 - \sqrt{q_1^2 - q_2(4\beta^2+2)}}{q_2} \quad \text{or} \quad \dfrac{q_1 + \sqrt{q_1^2 - q_2(4\beta^2+2)}}{q_2} \le \mu < \dfrac{q_1 + \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2}. \end{cases} \qquad (2.17)$$

    Case 2: If $3q_2 - q_1^2 \le 0$, then

    $$\begin{cases} 0 < \beta < \sqrt{\dfrac{q_1^2 - q_2}{8q_2}}, \\[2mm] \dfrac{q_1 - \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2} < \mu \le \dfrac{q_1 - \sqrt{q_1^2 - q_2(4\beta^2+2)}}{q_2} \quad \text{or} \quad \dfrac{q_1 + \sqrt{q_1^2 - q_2(4\beta^2+2)}}{q_2} \le \mu < \dfrac{q_1 + \sqrt{q_1^2 - q_2(8\beta^2+1)}}{q_2}. \end{cases} \qquad (2.18)$$

    Together with the first inequality of (2.14) and (2.16)–(2.18), (2.4)–(2.6) are obtained. Thus, the proof is completed.
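
    To make the bounds in Theorem 2.2 concrete, the quantities $q_1$, $q_2$ and the $\mu$-interval coming from the first inequality of (2.15) can be evaluated numerically; the MATLAB snippet below is our own illustration (the admissible $\beta$ must additionally satisfy the case conditions above).

    ```matlab
    % Illustrative evaluation of q1, q2 and the mu-interval from the first inequality of (2.15)
    % for a chosen beta; the beta-range restrictions of Cases 1 and 2 still apply.
    q1 = min(svd(A))^2 + min(svd(B))^2 - 2*norm(A,2)*norm(B,2);
    q2 = (norm(A,2)^2 + norm(B,2)^2) * (norm(A,2) + norm(B,2))^2;
    c  = q1^2 - q2*(8*beta^2 + 1);       % the quantity c used in (2.4)-(2.6)
    if c > 0
        mu_lo = (q1 - sqrt(c)) / q2;     % left endpoint of the candidate mu-interval
        mu_hi = (q1 + sqrt(c)) / q2;     % right endpoint of the candidate mu-interval
    end
    ```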

    Remark 4. When the error iteration matrix of the proposed algorithm is of size $2\times 2$, the idea of using Lemma 2.2 to determine the range of the coefficients of the quadratic equation, and thereby ensure the convergence of the algorithm, has been widely applied in the literature, for example, in the SOR-like methods for solving absolute value equations [11,15].

    In this section, the PGI and GMI algorithms with adaptive parameters are constructed; the varying parameters are given explicitly by minimizing the residual norm at every iteration.

    We first present the calculation rule for the parameter used in Algorithm 2 by minimizing the current residual norm. The details are described below:

    Suppose the parameter $\mu_k$ is used in the $k$th step of Algorithm 2, and for $k = 1, 2, 3, \ldots$, the $(k-1)$th residual is defined by

    $$R^{(k-1)} = C - AX^{(k-1)} - X^{(k-1)}B, \qquad (3.1)$$

    then the PGI algorithm can be simply rewritten as

    $$X^{(k)} = X^{(k-1)} + \frac{\mu_k}{2}P^{-1}A^TR^{(k-1)} + \frac{\mu_k}{2}R^{(k-1)}B^TQ^{-1}. \qquad (3.2)$$

    According to (3.1) and (3.2), the $k$th residual is given by

    $$R^{(k)} = R^{(k-1)} - \frac{\mu_k}{2}M^{(k-1)} \qquad (3.3)$$

    with $M^{(k-1)} = P^{-1}A^TR^{(k-1)}B + R^{(k-1)}B^TQ^{-1}B + AP^{-1}A^TR^{(k-1)} + AR^{(k-1)}B^TQ^{-1}$.

    Taking the F-norm on both sides of (3.3), it holds that

    $$\|R^{(k)}\|_F^2 = \mathrm{tr}\Big[\big(R^{(k-1)} - \tfrac{\mu_k}{2}M^{(k-1)}\big)^T\big(R^{(k-1)} - \tfrac{\mu_k}{2}M^{(k-1)}\big)\Big] = \|R^{(k-1)}\|_F^2 - \mu_k\,\mathrm{tr}\big((M^{(k-1)})^TR^{(k-1)}\big) + \frac{\mu_k^2}{4}\|M^{(k-1)}\|_F^2.$$

    Let $\phi(\mu_k) = \|R^{(k)}\|_F^2$. The first-order derivative of $\phi(\mu_k)$ yields

    $$\frac{\partial \phi}{\partial \mu_k} = \frac{\mu_k}{2}\|M^{(k-1)}\|_F^2 - \mathrm{tr}\big((M^{(k-1)})^TR^{(k-1)}\big).$$

    It is easy to see that the unique stationary point of the function $\phi(\mu_k)$ is

    $$\mu_k = \frac{2\,\mathrm{tr}\big((M^{(k-1)})^TR^{(k-1)}\big)}{\|M^{(k-1)}\|_F^2}. \qquad (3.4)$$

    It is obvious that the second-order derivative of $\phi(\mu_k)$, i.e., $\dfrac{\partial^2\phi}{\partial\mu_k^2} = \dfrac{1}{2}\|M^{(k-1)}\|_F^2 > 0$, which implies that the stationary point in (3.4) is the unique minimum point of the function $\phi(\mu_k)$.

    Through the above arrangement, we formally outline the APGI method in Algorithm 4.

    Algorithm 4 The APGI algorithm.
    Require: Given two preconditioners $P$ and $Q$, an initial matrix $X^{(0)}$, and the parameter $\mu_1$
    Ensure: $X^{(k)}$
    1: For $k = 1, 2, \ldots$, until convergence, do
    2:  $X_1^{(k)} = X^{(k-1)} + \mu_k P^{-1}A^T\big[C - AX^{(k-1)} - X^{(k-1)}B\big]$,
    3:  $X_2^{(k)} = X^{(k-1)} + \mu_k\big[C - AX^{(k-1)} - X^{(k-1)}B\big]B^TQ^{-1}$,
    4:  $X^{(k)} = \big[X_1^{(k)} + X_2^{(k)}\big]/2$,
    5: according to (3.4), compute $\mu_{k+1}$.
    6: End
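
    A minimal MATLAB sketch of one APGI step is given below; it evaluates the minimum-residual parameter (3.4) from the current residual before updating, and the variable names are ours.

    ```matlab
    % One APGI step (cf. Algorithm 4): the adaptive parameter is the minimizer (3.4); names are ours.
    R  = C - A*X - X*B;                                          % residual R^(k-1)
    M  = P\(A'*R)*B + (R*B'/Q)*B + A*(P\(A'*R)) + A*(R*B'/Q);    % M^(k-1) as in (3.3)
    mu = 2 * trace(M' * R) / norm(M, 'fro')^2;                   % minimum-residual parameter (3.4)
    X1 = X + mu * (P \ (A' * R));
    X2 = X + mu * ((R * B') / Q);
    X  = (X1 + X2) / 2;                                          % updated iterate X^(k)
    ```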

    Remark 5. If $P = I_m$ and $Q = I_n$ are adopted, the APGI algorithm reduces to the adaptive GI (AGI) algorithm.

    The $k$th iteration of the GMI algorithm can be simply rewritten as

    $$X^{(k)} = X^{(k-1)} + \frac{\mu_k}{2}A^TR^{(k-1)} + \frac{\mu_k}{2}R^{(k-1)}B^T + \beta_k\big(X^{(k-1)} - X^{(k-2)}\big).$$

    The residual of the $k$th iteration is

    $$R^{(k)} = R^{(k-1)} - \frac{\mu_k}{2}M^{(k-1)} + \beta_k N^{(k-1)} \qquad (3.5)$$

    with $M^{(k-1)} = A^TR^{(k-1)}B + R^{(k-1)}B^TB + AA^TR^{(k-1)} + AR^{(k-1)}B^T$ and $N^{(k-1)} = R^{(k-1)} - R^{(k-2)}$. Taking the F-norm on both sides of (3.5), it follows that

    $$\begin{aligned} \|R^{(k)}\|_F^2 &= \mathrm{tr}\Big[\big(R^{(k-1)} - \tfrac{\mu_k}{2}M^{(k-1)} + \beta_kN^{(k-1)}\big)^T\big(R^{(k-1)} - \tfrac{\mu_k}{2}M^{(k-1)} + \beta_kN^{(k-1)}\big)\Big] \\ &= \|R^{(k-1)}\|_F^2 - \mu_k\,\mathrm{tr}\big((M^{(k-1)})^TR^{(k-1)}\big) + 2\beta_k\,\mathrm{tr}\big((N^{(k-1)})^TR^{(k-1)}\big) - \beta_k\mu_k\,\mathrm{tr}\big((M^{(k-1)})^TN^{(k-1)}\big) + \frac{\mu_k^2}{4}\|M^{(k-1)}\|_F^2 + \beta_k^2\|N^{(k-1)}\|_F^2. \end{aligned}$$

    Let $\psi(\mu_k, \beta_k) = \|R^{(k)}\|_F^2$. The first-order derivatives of $\psi(\mu_k, \beta_k)$ yield

    $$\begin{cases} \dfrac{\partial\psi}{\partial\mu_k} = \dfrac{\mu_k}{2}\|M^{(k-1)}\|_F^2 - \mathrm{tr}\big((M^{(k-1)})^TR^{(k-1)}\big) - \beta_k\,\mathrm{tr}\big((M^{(k-1)})^TN^{(k-1)}\big), \\[2mm] \dfrac{\partial\psi}{\partial\beta_k} = 2\beta_k\|N^{(k-1)}\|_F^2 + 2\,\mathrm{tr}\big((N^{(k-1)})^TR^{(k-1)}\big) - \mu_k\,\mathrm{tr}\big((M^{(k-1)})^TN^{(k-1)}\big). \end{cases}$$

    It is easy to see that the unique stationary point of the function ψ(μk,βk) is

    {μk=2ak1ek12bk1ck1dk1ek1b2k1,βk=bk1ak1ck1dk1dk1ek1b2k1, (3.6)

    where ak1=tr((M(k1))TR(k1)), bk1=tr((M(k1))TN(k1)), ck1=tr((N(k1))TR(k1)), dk1=M(k1)2F, ek1=N(k1)2F. We also know that the sencond-order derivative of the function ψ(μk,βk) is

    $$\frac{\partial^2\psi}{\partial\mu_k^2} = \frac{1}{2}\|M^{(k-1)}\|_F^2, \qquad \frac{\partial^2\psi}{\partial\mu_k\partial\beta_k} = \frac{\partial^2\psi}{\partial\beta_k\partial\mu_k} = -\mathrm{tr}\big((M^{(k-1)})^TN^{(k-1)}\big), \qquad \frac{\partial^2\psi}{\partial\beta_k^2} = 2\|N^{(k-1)}\|_F^2,$$

    and the Hessian matrix of the function $\psi(\mu_k, \beta_k)$ at the stationary point is

    $$\begin{pmatrix} \dfrac{1}{2}\|M^{(k-1)}\|_F^2 & -\mathrm{tr}\big((M^{(k-1)})^TN^{(k-1)}\big) \\[2mm] -\mathrm{tr}\big((M^{(k-1)})^TN^{(k-1)}\big) & 2\|N^{(k-1)}\|_F^2 \end{pmatrix}.$$

    It is obvious that the Hessian matrix is symmetric positive definite, which implies that the stationary point mentioned in (3.6) is the unique minimum point of the function ψ(μk,βk).

    According to the above explanation, the AGMI algorithm is formulated as described in Algorithm 5.

    Algorithm 5 The AGMI algorithm.
    Require: Given two initial approximate matrices $X^{(0)}$ and $X^{(1)}$, and two parameters $\mu_2$ and $\beta_2$
    Ensure: $X^{(k)}$
    1: For $k = 2, 3, 4, \ldots$, until convergence, do
    2:  $X_1^{(k)} = X^{(k-1)} + \mu_k A^T\big[C - AX^{(k-1)} - X^{(k-1)}B\big]$,
    3:  $X_2^{(k)} = X^{(k-1)} + \mu_k\big[C - AX^{(k-1)} - X^{(k-1)}B\big]B^T$,
    4:  $X^{(k)} = \big[X_1^{(k)} + X_2^{(k)}\big]/2 + \beta_k\big(X^{(k-1)} - X^{(k-2)}\big)$,
    5: according to (3.6), compute $\mu_{k+1}$ and $\beta_{k+1}$.
    6: End
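
    A minimal MATLAB sketch of one AGMI step follows, showing how the pair $(\mu_k, \beta_k)$ in (3.6) is evaluated; the variable names (R, Rold, M, N, a, b, c, d, e) are ours, with Rold and Xold carried over from the previous step.

    ```matlab
    % One AGMI step (cf. Algorithm 5): the adaptive pair (mu, beta) is the minimizer (3.6); names are ours.
    R  = C - A*X - X*B;                                 % residual R^(k-1)
    N  = R - Rold;                                      % N^(k-1) = R^(k-1) - R^(k-2)
    M  = (A'*R)*B + R*(B'*B) + (A*A')*R + A*(R*B');     % M^(k-1) as in (3.5)
    a = trace(M'*R);  b = trace(M'*N);  c = trace(N'*R);
    d = norm(M,'fro')^2;  e = norm(N,'fro')^2;
    mu   = (2*a*e - 2*b*c) / (d*e - b^2);               % adaptive mu_k from (3.6)
    beta = (a*b - c*d)     / (d*e - b^2);               % adaptive beta_k from (3.6)
    X1   = X + mu*(A'*R);
    X2   = X + mu*(R*B');
    Xnew = (X1 + X2)/2 + beta*(X - Xold);               % momentum update as in Algorithm 5
    Xold = X;  Rold = R;  X = Xnew;
    ```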

    In this section, several examples are presented to verify the efficiency of the proposed PGI, GMI, APGI, and AGMI algorithms compared with the GI [8], RGI [18], AGBI [25], AJGI [21], HSS [3], and NPHSS [16] algorithms. All examples are performed in MATLAB on a personal computer with a 1.61 GHz central processing unit (Intel(R) Core(TM) i7-10710) and 16 GB memory.

    The number of iteration steps (denoted by IT), the computing time in seconds (denoted by CPU), and the relative residual norm (denoted by RRN) are listed in the tables below. All the initial matrices are set to be zero matrices, and the iterations are stopped if the RRN at the current step satisfies

    $$\mathrm{RRN} := \frac{\|C - AX^{(k)} - X^{(k)}B\|_F}{\|C - AX^{(0)} - X^{(0)}B\|_F} \le 10^{-6},$$

    or the number of iteration steps exceeds 10,000.

    Example 1. Consider Eq (1.1) with the matrices A and B defined by

    $$A = \operatorname{diag}(1, 2, \ldots, n) + rL^T, \qquad B = 2^{-t}I_n + \operatorname{diag}(1, 2, \ldots, n) + rL^T + 2^{-t}L,$$

    where $L$ is the strictly lower triangular matrix having ones in the lower triangular part, $r = 2$, and $t = 1/2$. The right-hand side is $C = AX^* + X^*B$ with the exact solution $X^* = \texttt{ones}(n)$, where ones is a MATLAB built-in function.
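
    For completeness, a MATLAB construction of this test problem is sketched below; it follows our reading of the (partially garbled) formulas above, and in particular the factors $2^{-t}$ and the values r = 2, t = 1/2 should be treated as assumptions.

    ```matlab
    % Construction of the Example 1 test problem; the factors 2^(-t), r = 2, and t = 1/2
    % reflect our reading of the problem description and should be treated as assumptions.
    n = 200;  r = 2;  t = 1/2;
    L = tril(ones(n), -1);                        % strictly lower triangular matrix of ones
    A = diag(1:n) + r*L';
    B = 2^(-t)*eye(n) + diag(1:n) + r*L' + 2^(-t)*L;
    X = ones(n);                                  % exact solution used to build the right-hand side
    C = A*X + X*B;
    ```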

    The numerical results of the tested algorithms for Example 1 are listed in Table 2, and the corresponding error convergence curves are shown in Figure 1. The matrices $P$ and $Q$ in the PGI and APGI algorithms are the identity matrices, so the PGI and APGI algorithms are just the GI and AGI algorithms. The parameters in the AGBI (Algorithm 2.4 in [25]), GI ((13)–(15) in [8]), RGI (Algorithm 1 in [18]), and GMI algorithms are experimentally optimal, denoted by $\mu_{\exp}$ and $\beta_{\exp}$ in Table 1. As in [18] and [25], the relaxation parameters in the RGI and AGBI algorithms are both 0.5. Figure 1 shows the RRN of the AGBI, GI, AGI, GMI, and AGMI algorithms with $n = 200$. From the figure, it can be observed that the AGMI algorithm performs best among all these algorithms.

    Figure 1.  Convergence curves of different algorithms for Example 1.
    Table 1.  The experimental optimal parameters for Example 1.
    Algorithms  n=100  n=200  n=300  n=400
    GI μexp 9.713E-06 2.424E-06 1.077E-06 6.057E-07
    RGI μexp 2.356E-05 5.879E-06 2.612E-06 2.120E-06
    AGBI μexp 0.390E-04 0.901E-05 3.790E-06 8.500E-06
    GMI μexp 2.428E-05 6.062E-06 2.692E-06 1.514E-06
    βexp 6.000E-01 6.000E-01 6.000E-01 6.000E-01

    Table 2.  Numerical results of different algorithms for Example 1.
    Algorithms  n=100  n=200  n=300  n=400
    GI IT 5413 5235 5174 5142
    CPU 3.349 14.091 39.421 116.627
    RRN 9.997E-07 9.999E-07 9.997E-07 9.999E-07
    RGI IT 4464 4318 4267 4241
    CPU 2.905 11.317 39.024 105.797
    RRN 9.997E-07 9.998E-07 9.998E-07 9.999E-07
    AGBI IT 2772 2879 2992 2985
    CPU 2.218 10.852 34.857 88.261
    RRN 9.996E-07 9.998E-07 9.995E-07 9.997E-07
    AGI IT 1681 1627 1608 1598
    CPU 2.212 9.114 31.939 86.779
    RRN 9.996E-07 9.992E-07 9.992E-07 9.995E-07
    GMI IT 864 836 826 821
    CPU 0.371 1.504 5.739 13.950
    RRN 9.993E-07 9.986E-07 9.988E-07 9.987E-07
    AGMI IT 94 93 92 91
    CPU 0.162 0.622 2.135 5.891
    RRN 9.785E-07 9.744E-07 9.753E-07 9.948E-07


    From Table 2, compared with the GI, RGI, AGBI, and AGI algorithms, the GMI and AGMI algorithms are more effective in terms of IT and CPU time. In addition, since the parameters $\mu_k$ and $\beta_k$ in the AGI and AGMI algorithms are varied adaptively in each iteration, these two algorithms are more efficient than the GI and GMI algorithms, respectively.

    Example 2. The matrices A and B in Eq (1.1) are given as

    $$A = \begin{pmatrix} 10 & 1 & 1 & \cdots & 1 \\ 2 & 10 & 1 & \cdots & 1 \\ 1 & 2 & 10 & \cdots & 1 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 & 10 \end{pmatrix}, \qquad B = \begin{pmatrix} 8 & 1 & 1 & \cdots & 1 \\ 3 & 8 & 1 & \cdots & 1 \\ 1 & 3 & 8 & \cdots & 1 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 1 & 1 & \cdots & 3 & 8 \end{pmatrix}.$$

    The right-hand side is $C = AX^* + X^*B$ with $X^* = \texttt{ones}(n)$.
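
    A MATLAB construction of these matrices is sketched below, assuming the pattern read off above (diagonal entries 10 and 8, subdiagonal entries 2 and 3, ones elsewhere); this reading of the garbled display should be treated as an assumption.

    ```matlab
    % Construction of the Example 2 matrices under the assumed pattern:
    % A has 10 on the diagonal, 2 on the subdiagonal, and 1 elsewhere;
    % B has 8 on the diagonal, 3 on the subdiagonal, and 1 elsewhere.
    n = 256;
    A = ones(n) + 9*eye(n) + diag(ones(n-1,1), -1);
    B = ones(n) + 7*eye(n) + 2*diag(ones(n-1,1), -1);
    X = ones(n);                                  % exact solution
    C = A*X + X*B;                                % right-hand side
    ```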

    The numerical results of the tested algorithms for Example 2 are listed in Table 3, and the corresponding error convergence curves are shown in Figure 2. The matrices $P$ and $Q$ in the PGI and APGI algorithms are the diagonal parts of the matrices $A$ and $B$, respectively. The experimentally optimal parameters contained in the GI (see (13)–(15) in [8]), HSS ("The HSS Iteration Method" in [3]), NPHSS (Algorithm 1 in [16]), PGI, and GMI algorithms are given in Table 4. From Table 3, it is clear that the seven algorithms are convergent for all cases. Furthermore, the APGI and AGMI algorithms need fewer iterations and less CPU time than the other algorithms, and the AGMI algorithm performs best among the seven algorithms. Figure 2 illustrates the convergence curves of the HSS, NPHSS, PGI, GMI, APGI, and AGMI algorithms for the case $n = 256$. It is clear that the RRN of the AGMI and NPHSS algorithms rapidly decreases below $10^{-13}$, but the NPHSS algorithm needs much more CPU time, so the AGMI algorithm is the most efficient one.

    Table 3.  Numerical results of different algorithms for Example 2.
    Algorithms  n=128  n=256  n=512  n=1024
    GI IT 43 38 35 31
    CPU 0.047 0.213 1.216 7.249
    RRN 9.362E-07 9.739E-07 9.743E-07 9.345E-07
    HSS IT 6 6 5 4
    CPU 0.344 1.817 8.361 46.524
    RRN 1.917E-07 4.749E-07 6.788E-07 6.678E-07
    NPHSS IT 5 5 4 3
    CPU 0.054 0.321 2.154 14.542
    RRN 6.137E-07 4.381E-08 3.839E-08 4.305E-07
    PGI IT 17 15 13 12
    CPU 0.026 0.106 0.588 3.919
    RRN 6.848E-07 7.719E-07 6.914E-07 6.468E-07
    GMI IT 22 18 19 18
    CPU 0.021 0.069 0.474 2.995
    RRN 8.216E-07 8.211E-07 7.853E-07 8.068E-07
    APGI IT 4 4 3 3
    CPU 0.023 0.084 0.337 2.365
    RRN 9.739E-07 1.733E-07 4.279E-07 1.525E-07
    AGMI IT 3 3 3 3
    CPU 0.029 0.064 0.247 1.961
    RRN 1.228E-07 1.204E-08 4.862E-09 1.592E-09

    Figure 2.  Convergence curves of different algorithms for Example 2.
    Table 4.  The experimental optimal parameters for Example 2.
    Algorithms  n=128  n=256  n=512  n=1024
    GI μexp 1.323E-05 3.547E-06 8.273E-07 1.872E-07
    HSS α1 1.329E+02 3.229E+02 6.462E+02 1.091E+03
    α2 1.046E+02 2.548E+02 5.104E+02 8.624E+02
    NPHSS α1 2.248E+00 2.499E+00 1.999E+00 2.125E+00
    α2 1.439E+01 1.599E+01 1.279E+01 1.359E+01
    PGI μexp 3.059E-04 8.201E-05 2.125E-05 5.409E-06
    GMI μexp 1.984E-05 5.675E-06 1.195E-06 2.575E-07
    βexp 1.490E-01 1.550E-01 1.750E-01 1.850E-01


    Example 3. The matrices in Eq (1.1) are described as

    $$A = B = M + 2N + \frac{100}{(n+1)^2}I_n,$$

    where $M = \operatorname{tridiag}(-1, 2.6, -1)$ and $N = \operatorname{tridiag}(0.5, 0, -0.5)$. The right-hand side is $C = AX^* + X^*B$ with $X^* = \texttt{ones}(n)$.
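
    A MATLAB construction of this test problem is sketched below; the off-diagonal signs of M and N follow the usual convention for this family of test problems and reflect our reading of the description above.

    ```matlab
    % Construction of the Example 3 matrices; the off-diagonal signs of M and N are assumptions
    % based on the usual tridiag(-1, 2.6, -1) and tridiag(0.5, 0, -0.5) test-problem convention.
    n = 256;
    e = ones(n, 1);
    M = spdiags([-e, 2.6*e, -e], -1:1, n, n);              % tridiag(-1, 2.6, -1)
    N = spdiags([0.5*e, zeros(n,1), -0.5*e], -1:1, n, n);  % tridiag(0.5, 0, -0.5)
    A = full(M + 2*N + 100/(n+1)^2 * speye(n));
    B = A;
    X = ones(n);  C = A*X + X*B;                           % exact solution and right-hand side
    ```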

    The numerical results of the tested algorithms for Example 3 are listed in Table 6, and the corresponding error convergence curves are shown in Figure 3. Let $P$ and $Q$ be the tridiagonal parts of the matrices $A^TA$ and $BB^T$, respectively. The experimentally optimal parameters contained in the GI ((13)–(15) in [8]), AJGI (Algorithm 5 in [21]), PGI, and GMI algorithms are given in Table 5. As in [21], the relaxation parameters $\omega_1$ and $\omega_2$ in the AJGI algorithm are 0.5 and 3, respectively. From Table 6, it follows that all the algorithms are effective for this example. In addition, the APGI algorithm is the most efficient one among the six algorithms. Figure 3 shows their convergence behavior for the case $n = 256$, and the APGI algorithm clearly has the best convergence result.

    Figure 3.  Convergence curves of different algorithms for Example 3.
    Table 5.  The experimental optimal parameters for Example 3.
    Algorithms  n=128  n=256  n=512  n=1024
    GI μexp 4.714E-02 4.723E-02 4.725E-02 4.726E-02
    AJGI μexp 2.400E-02 2.400E-02 2.300E-02 2.300E-02
    GMI μexp 8.800E-02 8.300E-02 8.700E-02 8.800E-02
    βexp 8.700E-01 8.700E-01 8.700E-01 8.700E-01
    PGI μexp 4.400E-01 4.200E-01 3.900E-01 3.900E-01

    Table 6.  Numerical results of different algorithms for Example 3.
    Algorithms  n=128  n=256  n=512  n=1024
    GI IT 398 397 398 399
    CPU 0.429 2.085 14.791 117.676
    RRN 9.954E-07 9.703E-07 9.822E-07 9.751E-07
    AJGI IT 180 183 185 185
    CPU 0.166 0.941 5.978 51.097
    RRN 9.129E-07 9.161E-07 8.603E-07 8.928E-07
    GMI IT 190 186 182 181
    CPU 0.211 0.799 6.713 42.489
    RRN 8.229E-07 7.694E-07 7.376E-07 6.119E-07
    PGI IT 96 95 95 109
    CPU 0.209 0.768 5.23 46.233
    RRN 8.448E-07 7.775E-07 9.507E-07 9.342E-07
    AGMI IT 51 50 48 47
    CPU 0.138 0.642 4.217 38.757
    RRN 7.780E-07 8.138E-07 9.664E-07 8.567E-07
    APGI IT 30 28 26 24
    CPU 0.093 0.437 2.802 24.721
    RRN 9.078E-07 9.643E-07 9.197E-07 8.501E-07


    In this paper, we provide two accelerated GI algorithms for solving Eq (1.1): the preconditioned gradient-based iteration (PGI) algorithm and the gradient-based momentum iteration (GMI) algorithm. Convergence analyses show that the proposed algorithms converge to the exact solution for any initial value under certain assumptions. Moreover, the adaptive PGI and GMI algorithms are also established, and the adaptive parameters can be computed by minimizing the residual norms in the corresponding algorithms. Numerical experiments illustrate the excellent performance of the proposed algorithms. How to use the APGI and AGMI algorithms for solving other matrix equations will be investigated in our future work.

    The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.

    Huiling Wang: gave the algorithms proposed in the manuscript, provided the numerical results and wrote the original draft of the manuscript; Nian-Ci Wu and Yufeng Nie: gave the clear guidance on the proof of the theorem and polished the language of the entire manuscript. All authors have read and agreed to the published version of the manuscript.

    The work is supported by the National Natural Science Foundation of China (12201651), the Fundamental Research Funds for the Central Universities, South-Central Minzu University (CZQ23004), and the Research Project Supported by Shanxi Scholarship Council of China (2023-117).

    The authors declare no conflicts of interest.



    [1] A. L. Andrew, Eigenvectors of certain matrices, Linear Algebra Appl., 7 (1973), 151–162. http://dx.doi.org/10.1016/0024-3795(73)90049-9 doi: 10.1016/0024-3795(73)90049-9
    [2] Z. Z. Bai, G. H. Golub, J. Y. Pan, Preconditioned Hermitian and skew-Hermitian splitting methods for non-Hermitian positive semidefinite linear systems, Numer. Math., 98 (2004), 1–32. http://dx.doi.org/10.1007/s00211-004-0521-1 doi: 10.1007/s00211-004-0521-1
    [3] Z. Z. Bai, On Hermitian and skew-Hermitian splitting iteration methods for continuous Sylvester equations, J. Comput. Math., 29 (2011), 185–198. http://dx.doi.org/10.4208/jcm.1009-m3152 doi: 10.4208/jcm.1009-m3152
    [4] Z. Z. Bai, M. Benzi, F. Chen, Z. Q. Wang, Preconditioned MHSS iteration methods for a class of block two-by-two linear systems with applications to distributed control problems, IMA J. Numer. Anal., 33 (2013), 343–369. http://dx.doi.org/10.1093/imanum/drs001 doi: 10.1093/imanum/drs001
    [5] A. Bhaya, E. Kaszkurewicz, Steepest descent with momentum for quadratic functions is a version of the conjugate gradient method, Neural Networks, 17 (2004), 65–71. http://dx.doi.org/10.1016/S0893-6080(03)00170-9 doi: 10.1016/S0893-6080(03)00170-9
    [6] Z. B. Chen, X. S. Chen, Conjugate gradient-based iterative algorithm for solving generalized periodic coupled Sylvester matrix equation, J. Franklin I., 359 (2022), 9925–9951. http://dx.doi.org/10.1016/j.jfranklin.2022.09.049 doi: 10.1016/j.jfranklin.2022.09.049
    [7] M. Dehghan, A. Shirilord, The double-step scale splitting method for solving complex Sylvester matrix equation, Comp. Appl. Math., 38 (2019), 146. http://dx.doi.org/10.1007/s40314-019-0921-6 doi: 10.1007/s40314-019-0921-6
    [8] F. Ding, T. W. Chen, Gradient based iterative algorithms for solving a class of matrix equations, IEEE T. Automat. Contr., 50 (2005), 1216–1221. http://dx.doi.org/10.1109/TAC.2005.852558 doi: 10.1109/TAC.2005.852558
    [9] F. Ding, P. X. Liu, J. Ding, Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle, Appl. Math. Comput., 197 (2008), 41–50. http://dx.doi.org/10.1016/j.amc.2007.07.040 doi: 10.1016/j.amc.2007.07.040
    [10] F. Ding, X. H. Wang, Q. J. Chen, Y. S. Xiao, Recursive least squares parameter estimation for a class of output nonlinear systems based on the model decompositions, Circuits Syst. Signal Process., 35 (2016), 3323–3338. http://dx.doi.org/10.1007/s00034-015-0190-6 doi: 10.1007/s00034-015-0190-6
    [11] X. Dong, X. H. Shao, H. L. Shen, A new SOR-like method for solving absolute value equations, Appl. Numer. Math., 156 (2020), 410–421. http://dx.doi.org/10.1016/j.apnum.2020.05.013 doi: 10.1016/j.apnum.2020.05.013
    [12] K. Du, C. C. Ruan, X. H. Sun, On the convergence of a randomized block coordinate descent algorithm for a matrix least squares problem, Appl. Math. Lett., 124 (2022), 107689. http://dx.doi.org/10.1016/j.aml.2021.107689 doi: 10.1016/j.aml.2021.107689
    [13] W. Fan, C. Gu, Z. Tian, Jacobi-gradient iterative algorithms for Sylvester matrix equations, 14th Conference of the International Linear Algebra Society, Shanghai, China, 2007, 16–20.
    [14] C. Q. Gu, H. Y. Xue, A shift-splitting hierarchical identification method for solving Lyapunov matrix equations, Linear Algebra Appl., 430 (2009), 1517–1530. http://dx.doi.org/10.1016/j.laa.2008.01.010 doi: 10.1016/j.laa.2008.01.010
    [15] B. H. Huang, W. Li, A modified SOR-like method for absolute value equations associated with second order cones, J. Comput. Appl. Math., 400 (2022), 113745. http://dx.doi.org/10.1016/j.cam.2021.113745 doi: 10.1016/j.cam.2021.113745
    [16] X. Li, H. F. Huo, A. L. Yang, Preconditioned HSS iteration method and its non-alternating variant for continuous Sylvester equations, Comput. Math. Appl., 75 (2018), 1095–1106. http://dx.doi.org/10.1016/j.camwa.2017.10.028 doi: 10.1016/j.camwa.2017.10.028
    [17] M. S. Mehany, Q. W. Wang, Three symmetrical systems of coupled Sylvester-like quaternion matrix equations, Symmetry, 14 (2022), 550. http://dx.doi.org/10.3390/sym14030550 doi: 10.3390/sym14030550
    [18] Q. Niu, X. Wang, L. Z. Lu, A relaxed gradient based algorithm for solving Sylvester equations, Asian J. Control, 13 (2011), 461–464. http://dx.doi.org/10.1002/asjc.328 doi: 10.1002/asjc.328
    [19] B. T. Polyak, Some methods of speeding up the convergence of iteration methods, Comp. Math. Math. Phys., 4 (1964), 1–17. http://dx.doi.org/10.1016/0041-5553(64)90137-5 doi: 10.1016/0041-5553(64)90137-5
    [20] S. G. Shafiei, M. Hajarian, An iterative method based on ADMM for solving generalized Sylvester matrix equations, J. Franklin I., 359 (2022), 8155–8170. http://dx.doi.org/10.1016/j.jfranklin.2022.07.049 doi: 10.1016/j.jfranklin.2022.07.049
    [21] Z. L. Tian, M. Y. Tian, C. Q. Gu, X. N. Hao, An accelerated Jacobi-gradient based iterative algorithm for solving Sylvester matrix equations, Filomat, 31 (2017), 2381–2390. http://dx.doi.org/10.2298/FIL1708381T doi: 10.2298/FIL1708381T
    [22] Z. L. Tian, Y. D. Wang, Y. H. Dong, S. Y. Wang, New results of the IO iteration algorithm for solving Sylvester matrix equation, J. Franklin I., 359 (2022), 8201–8217. http://dx.doi.org/10.1016/j.jfranklin.2022.08.018 doi: 10.1016/j.jfranklin.2022.08.018
    [23] Q. W. Wang, R. Y. Lv, Y. Zhang, The least-squares solution with the least norm to a system of tensor equations over the quaternion algebra, Linear Multilinear A., 70 (2022), 1942–1962. http://dx.doi.org/10.1080/03081087.2020.1779172 doi: 10.1080/03081087.2020.1779172
    [24] Q. W. Wang, X. Wang, A system of coupled two-sided Sylvester-type tensor equations over the quaternion algebra, Taiwanese J. Math., 24 (2020), 1399–1416. http://dx.doi.org/10.11650/tjm/200504 doi: 10.11650/tjm/200504
    [25] Y. J. Xie, C. F. Ma, The accelerated gradient based iterative algorithm for solving a class of generalized Sylvester-transpose matrix equation, Appl. Math. Comput., 273 (2016), 1257–1269. http://dx.doi.org/10.1016/j.amc.2015.07.022 doi: 10.1016/j.amc.2015.07.022
    [26] A. L. Yang, Y. Cao, Y. J. Wu, Minimum residual Hermitian and skew-Hermitian splitting iteration method for non Hermitian positive definite linear systems, BIT Numer. Math., 59 (2019), 299–319. http://dx.doi.org/10.1007/s10543-018-0729-6 doi: 10.1007/s10543-018-0729-6
    [27] A. L. Yang, On the convergence of the minimum residual HSS iteration method, Appl. Math. Lett., 94 (2019), 210–216. http://dx.doi.org/10.1016/j.aml.2019.02.031 doi: 10.1016/j.aml.2019.02.031
    [28] J. F. Yin, Q. Y. Dou, Generalized preconditioned Hermitian and skew-Hermitian splitting methods for non-Hermitian positive-definite linear systems, J. Comput. Math., 30 (2012), 404–417. http://dx.doi.org/10.4208/jcm.1201-m3209 doi: 10.4208/jcm.1201-m3209
    [29] X. F. Zhang, Q. W. Wang, Developing iterative algorithms to solve Sylvester tensor equations, Appl. Math. Comput., 409 (2021), 126403. http://dx.doi.org/10.1016/j.amc.2021.126403 doi: 10.1016/j.amc.2021.126403
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
