1. Introduction
This manuscript studies numerical methods for large sparse systems of nonlinear equations of the form
F(u) = 0,  (1.1)
where F: D ⊂ C^n → C^n is a continuously differentiable function. We assume that the Jacobian matrix F′(u) of F can be split as
F′(u) = W(u) + iT(u),  (1.2)
where W(u) and T(u) are both real symmetric matrices, and i = √−1 stands for the imaginary unit. Such nonlinear systems arise in applications in physics, scientific computing and engineering [1,2,3].
To solve systems of the form (1.1), one of the most commonly used methods is Newton's method, i.e.,
u_{k+1} = u_k + s_k, where s_k solves the Newton equation F′(u_k)s_k = −F(u_k), k = 0,1,2,….  (1.3)
However, when the problem size becomes large, each step of the traditional Newton method is expensive. It is well known that the inexact Newton method [4] overcomes this difficulty and improves the iteration efficiency: in each step, only an approximate solution of the Newton equation (1.3) needs to be obtained. In this sense, we trade a little precision for better efficiency. Inexact Newton methods have been widely used in recent decades.
Algorithm 1.1. Inexact Newton method
1. Set an initial guess u0.
2. Set some η_k ∈ [0,1). For k = 0,1,⋯, solve
F′(u_k)s_k = −F(u_k)
to find some s_k which satisfies
‖F(u_k) + F′(u_k)s_k‖ ≤ η_k‖F(u_k)‖.
3. Set u_{k+1} = u_k + s_k.
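As a hedged illustration, the following minimal MATLAB sketch implements Algorithm 1.1 with GMRES as the inner solver; the handle names F, Fprime and the driver itself are our own illustrative choices, not notation from the cited papers. Note that the GMRES relative residual tolerance eta enforces exactly the condition ‖F(u_k) + F′(u_k)s_k‖ ≤ η_k‖F(u_k)‖.

```matlab
% Minimal sketch of the inexact Newton method (Algorithm 1.1).
% F, Fprime: user-supplied handles for F(u) and the Jacobian F'(u).
function u = inexact_newton(F, Fprime, u0, eta, tol, maxit)
    u  = u0;
    r0 = norm(F(u0));                      % reference residual
    for k = 1:maxit
        r = F(u);
        if norm(r) <= tol * r0, break; end
        % Inner solve: ||F(u) + F'(u)*s|| <= eta*||F(u)||, via GMRES
        [s, ~] = gmres(Fprime(u), -r, [], eta);
        u = u + s;                         % outer (Newton) update
    end
end
```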
Inexact Newton methods can be viewed as inner-outer iterative methods. The outer iteration is Newton's method, which generates the sequence {u_k}; the inner iterations are linear iterative methods used inside Newton's method to solve the Newton equations approximately. This inner-outer scheme greatly improves the computational efficiency of the traditional Newton method. In the past few decades, a number of linear iterations, such as the classical splitting methods [5,6,7] and the modern Krylov subspace methods [9], have been utilized inside inexact Newton methods [10,11,12]. For other recent research in this area, see [13,14,15].
In [16], Darvishi and Barati constructed a modified Newton method, which requires only one more evaluation of F per step than Newton's method while attaining an R-order of convergence of at least three. Compared with Newton's method, the modified Newton method improves both the convergence speed and the convergence order. It has also served as the mainstream outer iteration of the previously mentioned iterative schemes in recent years; for examples, see [17,18,19,20,21,22,23]. The modified Newton method reads as follows:
v_k = u_k − F′(u_k)^{−1}F(u_k),
u_{k+1} = v_k − F′(u_k)^{−1}F(v_k),  k = 0,1,2,….
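A hedged MATLAB sketch of this outer iteration is given below; a dense direct solve stands in for whatever inner solver is preferred, and the handle names are illustrative.

```matlab
% Minimal sketch of the modified Newton method of [16]: one Jacobian and
% one extra F-evaluation per step, two substeps sharing F'(u_k).
function u = modified_newton(F, Fprime, u0, tol, maxit)
    u = u0; r0 = norm(F(u0));
    for k = 1:maxit
        if norm(F(u)) <= tol * r0, break; end
        J = Fprime(u);        % evaluated once per step
        v = u - J \ F(u);     % first substep (Newton step)
        u = v - J \ F(v);     % second substep reuses J
    end
end
```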
In this manuscript, we use the Euler-extrapolated Hermitian and skew-Hermitian splitting (EHS) method as the inner solver of the modified Newton method to establish our new method. The rest of this manuscript is structured as follows. In Section 2, we review the EHS method and build the modified Newton-EHS (MN-EHS) method. In Sections 3 and 4, we analyze the local and semilocal convergence properties of the MN-EHS method under the Hölder condition, respectively. Section 5 presents two numerical examples comparing our method with other methods from the earlier literature to demonstrate the computational efficiency of the new iteration scheme. Finally, a short summary is given in Section 6.
2. The MN-EHS method
In [24], Li and Ma proposed the EHS method, which was constructed to solve complex symmetric linear systems of the form Au = b efficiently. In that paper they proposed the Euler-extrapolated technique, which serves as an efficient solver for large sparse linear systems. We briefly review the EHS method [24] below.
Let us consider the complex symmetric linear system
Au = b,  A = W + iT,  (2.1)
where W, T are both symmetric positive semi-definite matrices, b ∈ C^n is a known vector and i = √−1 is the imaginary unit.
The EHS iterative method can be simply represented by the following formula:
where θ ∈ [0, π/2].
Notice that e^{−iθ} = cos(θ) − i sin(θ) according to Euler's formula, and u_0 ∈ C^n is the initial guess.
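To see why the Euler factor is useful, note that multiplying A = W + iT by e^{−iθ} mixes the real and imaginary parts; this identity is elementary algebra added here for illustration, not a result quoted from [24]:
\[
e^{-i\theta}A = (\cos\theta - i\sin\theta)(W + iT) = (\cos\theta\,W + \sin\theta\,T) + i\,(\cos\theta\,T - \sin\theta\,W).
\]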
Remark 2.1. There are some restrictions on the selection of the parameter θ; see the convergence results (Theorems 3.1 and 3.2) in [24]. For more studies of Euler-extrapolated techniques, see [25,26,27].
Now, we intend to establish the MN-EHS method. Suppose that
F′(u) = W(u) + iT(u),
where W(u), T(u) ∈ R^{n×n} can be calculated by using
W(u) = (F′(u) + F′(u)^*)/2,  T(u) = (F′(u) − F′(u)^*)/(2i),
where F′(u)^* denotes the conjugate transpose of F′(u).
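The conjugate-transpose formulas can be checked numerically; the following MATLAB snippet (a toy example, not from the paper) recovers W and T from a complex symmetric test matrix:

```matlab
% Check: for J = W + i*T with W, T real symmetric, J' = W - i*T, so the
% formulas W = (J+J')/2 and T = (J-J')/(2i) recover the two real parts.
rng(1); n = 4;
S1 = randn(n); S2 = randn(n);
W0 = (S1 + S1')/2;  T0 = (S2 + S2')/2;   % random real symmetric parts
J  = W0 + 1i*T0;                          % complex symmetric test matrix
W  = (J + J')/2;  T = (J - J')/(2i);      % J' is the conjugate transpose
assert(norm(W - W0) + norm(T - T0) < 1e-12);
```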
Algorithm 2.1. MN-EHS method
1. Let the initial guess u_0 be given. Set a nonnegative parameter θ ∈ [0, π/2], a positive constant tol and two positive integer sequences {l_k}_{k=0}^∞, {m_k}_{k=0}^∞.
2. For k = 0,1,⋯, until ‖F(u_k)‖ ≤ tol‖F(u_0)‖, do the following:
2.1. Set d_{k,0} = h_{k,0} := 0.
2.2. For l = 0,1,⋯,l_k − 1, apply the EHS method to the linear system F′(u_k)d = −F(u_k) of the form (1.5), and
obtain d_{k,l_k} such that
‖F(u_k) + F′(u_k)d_{k,l_k}‖ ≤ η_k‖F(u_k)‖.
2.3. Set v_k = u_k + d_{k,l_k}.
2.4. Compute F(v_k).
2.5. For m = 0,1,⋯,m_k − 1, apply the EHS method to the linear system F′(u_k)h = −F(v_k) of the form (1.5), and
obtain h_{k,m_k} such that
‖F(v_k) + F′(u_k)h_{k,m_k}‖ ≤ η̃_k‖F(v_k)‖.
2.6. Set u_{k+1} = v_k + h_{k,m_k}.
3. End.
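For orientation, here is a hedged MATLAB skeleton of Algorithm 2.1. The handle ehs_step, standing for one EHS sweep on a linear system A·x = b with parameter θ, is a hypothetical placeholder for the iteration of [24], which is not reproduced here.

```matlab
% Structural sketch of the MN-EHS method (Algorithm 2.1).
% ehs_step(A, b, x, theta) is a hypothetical single EHS sweep for A*x = b.
function u = mn_ehs(F, Fprime, ehs_step, u0, theta, lk, mk, tol, maxit)
    u = u0; r0 = norm(F(u0));
    for k = 1:maxit
        if norm(F(u)) <= tol * r0, break; end   % step 2 stopping rule
        A = Fprime(u);                          % Jacobian, reused below
        d = zeros(size(u));                     % step 2.1: d_{k,0} = 0
        for l = 1:lk                            % step 2.2: l_k EHS sweeps
            d = ehs_step(A, -F(u), d, theta);
        end
        v = u + d;                              % step 2.3
        h = zeros(size(u));                     % step 2.1: h_{k,0} = 0
        for m = 1:mk                            % step 2.5: m_k EHS sweeps
            h = ehs_step(A, -F(v), h, theta);
        end
        u = v + h;                              % step 2.6
    end
end
```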
Remark 2.2. For use in the subsequent analysis, we give the expressions of d_{k,l_k} and h_{k,m_k}:
where
After straightforward derivation, we have
Define
Then, the Jacobian matrix satisfies F′(u) = B(θ;u) − C(θ;u), and the corresponding iteration matrix is M(θ;u) = B(θ;u)^{−1}C(θ;u).
Therefore, we equivalently represent the MN-EHS method as follows:
3. Local convergence property of the MN-EHS method
In this section, we derive the convergence properties, beginning with local convergence under the Hölder continuity condition, similar to that in [28]. First, we state the following Banach lemma without proof.
Lemma 3.1. (Banach lemma) Let A, B ∈ C^{n×n} satisfy ‖I − BA‖ < 1; then, the matrices A, B are nonsingular. Moreover,
‖A^{−1}‖ ≤ ‖B‖/(1 − ‖I − BA‖)
and
‖A^{−1} − B‖ ≤ ‖B‖‖I − BA‖/(1 − ‖I − BA‖).
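As a quick sanity check of Lemma 3.1 (our own illustration, not part of the original text), the following MATLAB snippet verifies the first bound on a small random matrix, with B = I as a crude approximate inverse:

```matlab
% Numerical check of the Banach lemma bound ||A^{-1}|| <= ||B||/(1-||I-BA||).
rng(0); n = 5;
A = eye(n) + 0.1*randn(n);              % small perturbation of the identity
B = eye(n);                             % crude approximate inverse of A
assert(norm(eye(n) - B*A) < 1);         % hypothesis of the lemma
bound = norm(B) / (1 - norm(eye(n) - B*A));
fprintf('||inv(A)|| = %.3f <= %.3f\n', norm(inv(A)), bound);
```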
Suppose that F: D ⊂ C^n → C^n is a G-differentiable function on N_0 ⊂ D, where N_0 is a convex neighborhood of the point u^* satisfying F(u^*) = 0. Its Jacobian matrix F′(u) is continuous, positive definite and complex symmetric. For any u ∈ D, suppose that F′(u) = W(u) + iT(u) is the splitting of the Jacobian matrix F′(u). N(u^*, r) denotes the open ball centered at u^* with radius r > 0.
Assumption 3.1. For arbitrary u ∈ N(u^*, r) ⊂ N_0, assume that the following conditions hold.
(1) (The bounded condition) There are positive constants β and γ such that
(2) (The Hölder condition) For some p ∈ (0,1], there exist nonnegative constants H_w and H_t such that
Lemma 3.2. Under Assumption 3.1, for any u, v ∈ N(u^*, r), if r ∈ (0, 1/(γH)^{1/p}), then F′(u)^{−1} exists, and the following inequalities hold:
where S(u) := γ/(1 − γH‖u − u^*‖^p) and H := H_w + H_t.
Proof. According to the Hölder condition,
Since r ∈ (0, 1/(γH)^{1/p}), we have
Then, according to ‖F′(u^*)^{−1}‖ ≤ γ and Lemma 3.1, F′(u)^{−1} exists and
By
we get
As for the last inequality, since
it follows that
□
In the remainder of this article, we use the symbol ⌈·⌉ to denote the smallest integer that is no less than the corresponding real number, i.e., the ceiling function.
Theorem 3.1. Under the conditions of Assumption 3.1 and Lemma 3.2, let r ∈ (0, r_0), where r_0 = min{r_1, r_2, r_3}, and
The constant ν = min{l_*, m_*} satisfies ν > ⌈−ln(2βγ)/ln((τ+1)θ)⌉, where l_* = lim inf_{k→∞} l_k, m_* = lim inf_{k→∞} m_k, τ ∈ (0, (1−χ)/χ) is a prescribed positive constant and χ ≡ χ(θ; u_0) = ‖M(θ; u_0)‖ < 1.
Then, for any initial guess u_0 ∈ N(u^*, r) and any positive integer sequences {l_k}_{k=0}^∞ and {m_k}_{k=0}^∞, the iteration sequence {u_k}_{k=0}^∞ generated by the MN-EHS method converges to the exact solution u^*. In addition, we have
where
Proof. Since ‖M(θ; u^*)‖ ≤ δ(θ; u^*) < 1,
Then
According to the Hölder condition, we have
Based on Lemma 3.1,
Since
then
Here, r < r_1 implies that 4γH‖u − u^*‖^p/(1 − 2γH‖u − u^*‖^p) < τχ.
Hence,
Furthermore, we have
and, similarly, we get
Then, by induction, we can prove that {u_k}_{k=0}^∞ ⊂ N(u^*, r). First, when k = 0, ‖u_0 − u^*‖ < r < r_0 and
then u_1 ∈ N(u^*, r) since u_0 ∈ N(u^*, r).
Now suppose that u_n ∈ N(u^*, r) for some n ≥ 0; then, we have
which implies that u_{n+1} ∈ N(u^*, r). Moreover, u_n → u^* as n → ∞. □
4. Semilocal convergence property of the MN-EHS method
The convergence property discussed in the previous section is local: the iteration converges on the premise that the initial guess lies in an open ball centered at the exact solution. In practical computation, it is desirable to ascertain the existence of a solution of the nonlinear system (1.1), as well as the convergence of the iteration, directly from conditions on the initial guess. Here, we put forward the semilocal convergence theorem of the MN-EHS method.
Assumption 4.1. For any u_0 ∈ N_0, assume that the following conditions hold.
(1) (The bounded condition) There are two positive constants β and γ such that
(2) (The Hölder condition) There are two nonnegative constants H_w and H_t such that, for any u, v ∈ N(u_0, r) ⊂ N_0, the following inequalities are satisfied:
Lemma 4.1. For any u, v ∈ N(u_0, r), if r ∈ (0, 1/(γH)^{1/p}), then F′(u)^{−1} exists, and the following inequalities hold:
where H := H_w + H_t.
Proof. The proof is omitted because it is similar to that of Lemma 3.2. □
Before giving the semilocal convergence theorem, we need some preparation: we construct two sequences and give some lemmas.
First, we give two important functions:
where a,b,c and d are positive constants which satisfy
Define the sequences {t_k} and {s_k} by
Lemma 4.2. λ(t) is decreasing on [0, (a/b)^{1/p}) but increasing on [(a/b)^{1/p}, +∞). Moreover, if
then λ(t) = 0 has two solutions t^* and t^{**} in (0, +∞), which satisfy
Proof. See Lemma 2.1 in [18]. □
Lemma 4.3. Suppose that the sequences {t_k}, {s_k} are generated by the formula (4.6), and that t^* is the smaller nonnegative solution of φ(t) = 0. Then, the sequences {t_k} and {s_k} are increasing, converge to t^* and satisfy the following inequalities:
Proof. See Lemma 2.2 in [18]. □
The following theorem is the semilocal convergence theorem. Take a=(1+η)Hγ,b=(1−η),c=(1+η)γδ and d=Hγ in (4.4).
Theorem 4.1. Set r = min{r_1, r_2} with
where the constant ν = min{l_*, m_*} satisfies ν > ⌈−ln(2βγ)/ln((τ+1)θ)⌉; also, l_* = lim inf_{k→∞} l_k and m_* = lim inf_{k→∞} m_k. τ ∈ (0, (1−χ)/χ) is a prescribed positive constant and χ ≡ χ(θ; u_0) = ‖M(θ; u_0)‖ < 1.
Under the assumptions of Lemma 4.1, if
then the iteration sequence {u_k}_{k=0}^∞ generated by the MN-EHS method is well defined and converges to u^*, which satisfies F(u^*) = 0.
Proof. The following formulas are true and can be proved by induction:
We have
Here, we use an inequality introduced by Shen and Li in [29]. For any k>1, we have
since
Now, by induction, for any k,
Since u_{k−1}, v_{k−1} ∈ N(u_0, r), we have
Hence,
Similarly, we have
Consequently,
Now, the formulas given by (4.7) have been proved by induction. Since the sequences {t_k}, {s_k} converge to t^* and
the sequence {u_k} also converges to u^*. Since ‖M(θ; u^*)‖ < 1, we have that F(u^*) = 0. □
5. Numerical examples
Next, we demonstrate the validity of our new method via numerical examples. We chose some methods from previous papers, i.e., the modified Newton-PMHSS (MN-PMHSS) method [19] and the modified Newton-GSOR (MN-GSOR) method [23], for comparison with the MN-EHS method. In our computations, the CPU running time, denoted by "CPU time", was recorded by using the "tic-toc" command.
All of the numerical results in the following examples were obtained on a laptop with an AMD Ryzen 7 4800H (2.90 GHz) and 16.00 GB RAM, running MATLAB R2017b. The number of outer iteration steps is denoted by "Outer IT", and the number of inner iteration steps by "Inner IT". A thorny problem is the selection of the parameters in the iterations; in this study we used the experimentally optimal parameters, i.e., a parameter was chosen when it minimized the corresponding iteration counts and errors. All of the important data are listed in the tables.
Example 5.1. Consider the following group of partial differential equations, which can be converted to a nonlinear system:
where Ω = (0,1) × (0,1) with boundary ∂Ω. The constant κ, taken to be positive, measures the magnitude of the reaction term. We set the parameter values α_1 = α_2 = 1 and β_1 = β_2 = 2. This problem is converted to a nonlinear system of the form discussed in this manuscript by the central finite-difference scheme on an equidistant grid with step width Δt = h = 1/(N+1).
Here is the form of this nonlinear system:
where
Notice that A_N = tridiag(−1,2,−1) is a tridiagonal matrix and ⊗ denotes the Kronecker product.
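As a hedged sketch (the exact assembly in (5.1) is not reproduced here), the sparse block A_N and the Kronecker-product structure typical of a 2-D central-difference discretization can be formed in MATLAB as follows:

```matlab
% Sparse tridiagonal block and 2-D Kronecker-sum stencil (illustrative only).
N  = 30;  h = 1/(N+1);
AN = gallery('tridiag', N, -1, 2, -1);         % A_N = tridiag(-1,2,-1), sparse
K  = kron(speye(N), AN) + kron(AN, speye(N));  % common 2-D Laplacian structure
```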
It is obvious that u^* = 0 is a solution of (5.1), and the Jacobian matrix of F(u) can easily be worked out:
In our experiment, the initial guess was chosen to be u_0 = 1. The stopping criterion for the outer iteration was taken as
and η_k = η̃_k = η = 0.1 is the tolerance of the inner iterations. Table 1 gives the optimal values of α or θ for the three methods.
Tables 2, 3 and 4 show the experimental data for N = 30, 60, 90, comparing our MN-EHS method with the MN-PMHSS and MN-GSOR methods. To show how the parameter is chosen, Figure 1 plots how the number of inner iteration steps of the MN-EHS method changes as the parameter varies. We employed the parameters that minimize the number of inner iteration steps as the optimal parameters; when the numbers of inner iteration steps are the same for different parameters, the one with the smaller error is chosen.
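This experimental parameter search amounts to a simple sweep; a hedged MATLAB sketch is given below, where mn_ehs_solve is a hypothetical driver returning the total number of inner iteration steps, and F, Fprime, u0 and tol are assumed to be given.

```matlab
% Illustrative parameter sweep: keep the theta minimizing the inner iterations.
thetas = linspace(0, pi/2, 50);
best = inf; theta_opt = thetas(1);
for t = thetas
    [~, inner_its] = mn_ehs_solve(F, Fprime, u0, t, tol);  % hypothetical driver
    if inner_its < best
        best = inner_its; theta_opt = t;
    end
end
```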
According to the results in Tables 2, 3 and 4, the MN-EHS method typically requires less CPU time and fewer iterations than the MN-PMHSS and MN-GSOR methods as the constant κ and the problem size vary. This indicates that the MN-EHS method is the most effective of the three in this example.
Finally, the iteration counts of the MN-EHS method for varying problem sizes are shown in Figures 2, 3 and 4. Broadly, the number of outer iteration steps exhibited almost no change, while the number of inner iteration steps increased with the problem size, although not sharply.
Example 5.2. Consider the nonlinear Helmholtz equation
where σ_1 and σ_2 are real coefficients. Notice that the solution of this equation satisfies the Dirichlet boundary condition on D = [0,1] × [0,1]. After discretization with mesh size h = 1/(N+1), the nonlinear system has the form
where
with
and B_N = (1/h²) tridiag(−1,2,−1) ∈ R^{N×N} is a tridiagonal matrix.
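B_N uses the same stencil as A_N in Example 5.1, scaled by 1/h²; a short MATLAB sketch:

```matlab
% B_N = (1/h^2) * tridiag(-1, 2, -1), assembled as a sparse matrix.
N  = 30;  h = 1/(N+1);
BN = gallery('tridiag', N, -1, 2, -1) / h^2;
```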
In this numerical experiment, we applied σ_1 = 10^3 and σ_2 = 10^4. The initial guess was taken as u_0 = 0, where 0 is the zero vector. The tolerance of the inner iterations was set as η_k = η̃_k = η = 0.1, the same as in the first example. Slightly different from the first example, the stopping criterion for the outer iterations was set as
Table 5 lists the experimental parameters we applied.
Table 6 displays the numerical results for N = 30, 60, 90; the MN-EHS method again outperformed the other two methods in this example.
6. Conclusions
The main aim of this article was to present an iterative method for solving large sparse nonlinear systems whose Jacobian matrices are complex symmetric. Solving this type of nonlinear system is very important in practical applications across a large number of scientific computations. This paper presented the construction of the new MN-EHS method and derived its convergence properties. Two academic test examples arising from differential equations were given. Comparing the MN-EHS method with some methods in the existing literature, in the form of tables and data, the results indicate that our proposed method performs better than existing methods on these types of problems.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant no. 12271479, Research on some efficient and fast algorithms for complex nonlinear problems).
Conflict of interest
The authors declare that they have no conflict of interest.