1. Introduction
Solving nonlinear equations in Banach spaces is an important task in applied science. Many problems can be reduced to an equation of the form
$$G(s) = 0. \tag{1.1}$$
Here, $G: \Phi \subseteq B_1 \to B_2$ is a sufficiently differentiable nonlinear operator on an open convex subset $\Phi$ of $B_1$, where $B_1$ and $B_2$ are Banach spaces. A nonlinear equation such as (1.1) is difficult to solve analytically. Moreover, most practical problems do not require the exact solution; an approximate value suffices, provided that its error with respect to the exact solution stays within the range acceptable for the problem. Such an approximation can be obtained by numerical iteration.
Fixed-point iteration remains the main numerical approach to solving nonlinear equations. One of the best-known iterative methods is Newton's method [1], whose iteration scheme is
$$s^{(k+1)} = s^{(k)} - \Delta^{(k)} G(s^{(k)}), \tag{1.2}$$
where $\Delta^{(k)} = G'(s^{(k)})^{-1}$ for $k = 0, 1, 2, \ldots$. Because of its simple structure, low computational cost, and fast convergence, Newton's method remains the most important iterative method for solving nonlinear equations in practical computation. Its main drawback is that its convergence order is only two. To meet the demand for higher precision, many higher-order iterative methods [2,3,4] have therefore been proposed on the basis of Newton's method. Cordero et al. proposed a sixth-order iterative method [5], whose iteration scheme is
In the iterative method (1.3), three function values $G(s^{(k)})$, $G(r^{(k)})$, $G(t^{(k)})$ and two Jacobian matrices $G'(s^{(k)})$, $G'(t^{(k)})$ must be computed, and three LU factorizations must be performed. (An LU factorization decomposes a matrix into the product of a lower triangular matrix and an upper triangular matrix.)
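To make the cost accounting concrete, the following is a minimal sketch of how an LU factorization enters a Newton step of the form (1.2): the Jacobian is factored once per iteration and the factors are then used to solve the linear system. The helper names `lu_factor`, `lu_solve`, and `newton`, as well as the 2×2 test system, are illustrative assumptions and are not taken from the paper.

```python
def lu_factor(a):
    """Doolittle LU with partial pivoting: returns (lu, perm) with P*A = L*U.

    L (unit lower triangular) and U are stored together in `lu`;
    row i of `lu` corresponds to row perm[i] of the original matrix.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Pick the row with the largest pivot candidate in column k.
        piv = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[piv][k] == 0.0:
            raise ValueError("matrix is singular")
        if piv != k:
            lu[k], lu[piv] = lu[piv], lu[k]
            perm[k], perm[piv] = perm[piv], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using the factors returned by lu_factor."""
    n = len(lu)
    x = [b[perm[i]] for i in range(n)]
    for i in range(n):                 # forward substitution: L y = P b
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):       # back substitution: U x = y
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def newton(G, J, x, steps=8):
    """Newton iteration: one LU factorization and one solve per step."""
    for _ in range(steps):
        lu, perm = lu_factor(J(x))
        d = lu_solve(lu, perm, G(x))   # d solves G'(x) d = G(x)
        x = [xi - di for xi, di in zip(x, d)]
    return x


# Hypothetical test system (not from the paper): x^2 + y^2 = 1, x = y.
G = lambda v: [v[0] ** 2 + v[1] ** 2 - 1.0, v[0] - v[1]]
J = lambda v: [[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]]
root = newton(G, J, [1.0, 0.5])
```

Once `lu_factor` has been called, each further right-hand side costs only one `lu_solve`, which is why the number of factorizations, rather than the number of linear solves, dominates the comparison between the methods.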
Zhanlav et al. also proposed an iterative method [6] with sixth-order convergence, which is in the form of
where $\Pi_k = (I - 4M_k)^{-1}\bigl(I - \tfrac{7}{2}M_k\bigr)$, $\Psi_k = (I + M_k)^{-1}\bigl(I + 2M_k - \tfrac{1}{2}M_k^2\bigr)$, and $M_k = I - \Delta^{(k)} G'(t^{(k)})$. The iterative method (1.4) also requires three function values $G(s^{(k)})$, $G(t^{(k)})$, $G(r^{(k)})$, two Jacobian matrices $G'(s^{(k)})$, $G'(t^{(k)})$, and three LU factorizations.
Cordero et al. also proposed an iterative method [7] of sixth-order convergence, whose iteration format is
where $\Delta^{(k)} = G'(s^{(k)})^{-1}$. Like methods (1.3) and (1.4), method (1.5) is of sixth order and requires three function values $G(s^{(k)})$, $G(t^{(k)})$, $G(r^{(k)})$ and two Jacobian matrices $G'(s^{(k)})$, $G'(t^{(k)})$, but it needs only two LU factorizations. Its computational cost is therefore lower than that of methods (1.3) and (1.4).
At present, the most commonly used techniques for proving semi-local convergence are the majorizing sequence method [8,9] and the recursion method [10,11]. Both go back to Kantorovich [12], and both rest on induction. When proving the semi-local convergence of an iterative method for systems of equations, one usually studies the method in one-dimensional space first, because iterative methods for nonlinear equations on the real line can be generalized to Banach spaces [13].
This paper uses the recursion method to analyze the semi-local convergence of Cordero's sixth-order iterative method (1.5). In Cordero's proof of sixth-order convergence, the operator $G$ is required to be sufficiently differentiable in a neighborhood of the solution, so that the sixth-order derivative used in the convergence proof is continuous. Consider the function
where $G: \Phi \subset \mathbb{R} \to \mathbb{R}$ and $\Phi = [-1, 2]$. Denoting the root of this function by $\alpha$, we observe that $\alpha = 1$ is a root of $G(s)$ and $G'''(s) = 24 s \ln s^2 + 300 s^2 - 68 s$. It is obvious that $G'''(s)$ is unbounded on $\Phi$, so the previous analysis does not guarantee the convergence of method (1.5). Therefore, in order to avoid the use of higher derivatives, we impose Lipschitz conditions only on the first-order Fréchet derivative to prove the semi-local convergence of the iterative method (1.5).
This paper is divided into six sections. In Section 2, we introduce three scalar functions and three auxiliary sequences used in the semi-local convergence proof and analyze their properties. In Section 3, we derive the recurrence relations used to prove the semi-local convergence of the iterative method (1.5). In Section 4, we prove the semi-local convergence of method (1.5) and the uniqueness of the solution. Numerical examples are presented in Section 5, and conclusions are given in Section 6.
2. Preparatory knowledge
Let $G: \Phi \subseteq B_1 \to B_2$ be a nonlinear Fréchet differentiable operator on the open set $\Phi$, where $B_1$ and $B_2$ are Banach spaces. Suppose that $s_0 \in \Phi$ and that the inverse $\Delta_0 = G'(s_0)^{-1} \in L(B_2, B_1)$ of the derivative at the initial point of the iterative method (1.5) exists, where $L(B_2, B_1)$ is the set of bounded linear operators from $B_2$ to $B_1$.
In addition, we use the following Kantorovich conditions [12] to obtain the semi-local convergence result for the iterative method (1.5):
(C1) $\|\Delta_0\| \le \beta$,
(C2) $\|\Delta_0 G(s_0)\| \le \eta$,
(C3) $\|G'(s) - G'(t)\| \le K \|s - t\|$ for all $s, t \in \Phi$,
where $K$, $\beta$, and $\eta$ are nonnegative real numbers. For brevity, we write $\eta_0 = \eta$, $\lambda_0 = K\beta\eta$, and $\mu_0 = q(\lambda_0)p(\lambda_0)$. Let $\lambda_0 < \sigma$, where $\sigma \approx 0.603 < 1$ is the smallest positive root of the scalar function $s h(s) - 1$, and define the sequences
where n≥0. The scalar functions are
These functions and sequences are the key tools for studying the semi-local convergence of the iterative method. The following lemmas relate the scalar functions defined by (2.4)–(2.6) to the sequences defined by (2.1)–(2.3); they are used later in deriving the recurrence relations.
Lemma 2.1. Let $h(s)$, $p(s)$, and $q(s)$ be the functions defined by (2.4)–(2.6). Then:
(a) $h(s)$, $p(s)$, and $q(s)$ are increasing, with $p(s) > 1$ and $h(s) > 1$ for $0 < s < \sigma$,
(b) $p(\lambda_0) q(\lambda_0) < 1$ for $\lambda_0 < 0.359$,
(c) $p(\lambda_0)^2 q(\lambda_0) < 1$ for $\lambda_0 < 0.297$.
Proof: Part (a) follows directly from the definition of an increasing function. Parts (b) and (c) are verified by numerical calculation. Since $p(\lambda_0)^2 q(\lambda_0) < 1$, the sequence $\{\lambda_n\}$ is decreasing by construction, so $\lambda_n < \lambda_0 \le 0.297$ for all $n \ge 1$.
Lemma 2.2. Let $p(s)$, $h(s)$, and $q(s)$ be the auxiliary functions defined by (2.4)–(2.6), and let $\sigma$ be the smallest positive root of the scalar function $s h(s) - 1$. If
then,
(a) $p(\lambda_0) > 1$ and $\mu_n < 1$ for $n \ge 0$,
(b) the sequences $\{\lambda_n\}$, $\{\mu_n\}$, and $\{\eta_n\}$ are decreasing, with $\lambda_n < 0.297$ for $n \ge 0$,
(c) $h(\lambda_n)\lambda_n < 1$ and $p(\lambda_n)\mu_n < 1$ for $n \ge 0$.
Proof: (a) From Lemma 2.1 and (2.7), $p(\lambda_0) > 1$ and $\mu_0 < 1$, so the claim holds for $n = 0$. The same argument gives $\mu_1 < 1$, and induction yields $\mu_n < 1$ for all $n \ge 0$.
(b) By (a) and the definition of the sequences (2.1)–(2.3), $\mu_n < 1$ implies $\eta_{n+1} < \eta_n$, so $\{\eta_n\}$ is decreasing. By Lemma 2.1, $p(\lambda_0)^2 q(\lambda_0) < 1$, so $\lambda_1 < \lambda_0$, and by induction $\{\lambda_n\}$ is decreasing. In the same way, $\mu_1 < \mu_0$, and $\{\mu_n\}$ is also decreasing.
(c) From Lemma 2.1 and the results above, $h(\lambda_1)\lambda_1 < h(\lambda_0)\lambda_0 < 1$ and $p(\lambda_1)\mu_1 < p(\lambda_0)\mu_0 < 1$, and (c) follows by induction.
3. Recursive relation
We now derive the recurrence relations for the iterative method (1.5); they form the basis of the semi-local convergence analysis that follows. Let $B(s, r) = \{t \in B_1 : \|t - s\| < r\}$ and $\overline{B}(s, r) = \{t \in B_1 : \|t - s\| \le r\}$. Under assumptions (C1)–(C3) of the previous section, the recurrence relations satisfied by the iterates of (1.5) are obtained as follows.
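The Taylor expansion of $G$ around $s_0$, evaluated at $t_0$, is
$$G(t_0) = G(s_0) + G'(s_0)(t_0 - s_0) + \int_{s_0}^{t_0} \bigl(G'(s) - G'(s_0)\bigr)\,ds. \tag{3.1}$$
By the first step of the iterative method (1.5), the term $G(s_0) + G'(s_0)(t_0 - s_0)$ is equal to zero. Using the change of variable $s = s_0 + v(t_0 - s_0)$, we get
$$G(t_0) = \int_0^1 \bigl(G'(s_0 + v(t_0 - s_0)) - G'(s_0)\bigr)(t_0 - s_0)\,dv \tag{3.2}$$
for $n = 0$. Since $\Delta_0$ exists by hypotheses (C1)–(C3), $t_0$ is well defined, and by (C2)
$$\|t_0 - s_0\| = \|\Delta_0 G(s_0)\| \le \eta.$$
This shows that $t_0 \in B(s_0, R\eta)$.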
Taking norms in (3.2) and applying the Lipschitz condition [14], we obtain
so that
Similarly, we can estimate $r_0 - t_0$.
so that
Applying Banach's lemma [15], it follows that
Thus, $G'(t_0)^{-1}$ exists and
The Taylor expansion of $G$ around $t_0$, evaluated at $r_0$, is
Taking norms and applying the Lipschitz condition, we obtain
Thus,
Therefore,
where $\lambda_0 = K\beta\eta$ and $h(s) = 1 + \frac{s}{2} + \frac{s^2}{2} + \frac{s^3(1+s)^2}{8(1-s)}$.
Applying Banach's lemma again, one has
Then, provided that $\lambda_0 h(\lambda_0) < 1$ (which holds when $\lambda_0 < \sigma$), Banach's lemma guarantees that $(\Delta_0 G'(s_1))^{-1} = \Delta_1 \Delta_0^{-1}$ exists and
where $p(s) = \frac{1}{1 - s h(s)}$.
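The threshold $\sigma \approx 0.603$ can be checked numerically from the expression for $h(s)$ stated above. The sketch below locates the root of $s h(s) - 1$ in $(0, 1)$ by bisection; the function name `smallest_positive_root` and the bracket $[0, 0.99]$ are illustrative choices, and the bracket relies on the observation that $s h(s)$ is increasing on $(0, 1)$, so there is only one sign change.

```python
def h(s):
    """h(s) = 1 + s/2 + s^2/2 + s^3 (1 + s)^2 / (8 (1 - s)), for 0 <= s < 1."""
    return 1.0 + s / 2.0 + s * s / 2.0 + s**3 * (1.0 + s) ** 2 / (8.0 * (1.0 - s))


def p(s):
    """p(s) = 1 / (1 - s h(s)); positive only while s h(s) < 1."""
    return 1.0 / (1.0 - s * h(s))


def smallest_positive_root(f, lo, hi, tol=1e-12):
    """Bisection; assumes f(lo) < 0 < f(hi) with a single sign change."""
    if not (f(lo) < 0.0 < f(hi)):
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# sigma is the root of s*h(s) - 1 on (0, 1); it agrees with 0.603 after rounding.
sigma = smallest_positive_root(lambda s: s * h(s) - 1.0, 0.0, 0.99)
```

For any $\lambda_0 < \sigma$ the denominator of $p(\lambda_0)$ is positive, which is what the application of Banach's lemma requires.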
Repeating the derivation above, we obtain the recurrence relations stated in the following lemma.
Lemma 3.1. For all $n \ge 1$, the following relations hold:
($I_n$) $\|\Delta_n\| \le p(\lambda_{n-1}) \|\Delta_{n-1}\|$,
($II_n$) $\|t_n - s_n\| = \|\Delta_n G(s_n)\| \le \eta_n$,
($III_n$) $K \|\Delta_n\| \|t_n - s_n\| \le \lambda_n$,
($IV_n$) $\|s_n - s_{n-1}\| \le h(\lambda_{n-1}) \eta_{n-1}$.
Proof: The proof is by induction. For $n = 1$, ($I_1$) has been proved in (3.14).
For ($II_1$), expanding $G$ around $t_0$ and evaluating at $s_1$, we get
Taking the norm of $G(s_1)$,
When one
then,
By applying (I1), we can get
Let
where $\mu_0 = q(\lambda_0) p(\lambda_0)$ and
($III_1$): Using ($I_1$) and ($II_1$), we obtain
($IV_1$): This was proved in (3.11) for $n = 1$.
4. Semi-local convergence analysis
In this section, we give the semi-local convergence theorem for the sixth-order iterative method. We first prove that $\{s_n\}$ is a Cauchy sequence, which guarantees that $\{s_n\}$ converges in the Banach space. Based on the above analysis of the sequences $\{\lambda_n\}$, $\{\mu_n\}$ and the auxiliary functions $h(s)$, $p(s)$, $q(s)$, we obtain the following result:
Theorem 4.1. Let $G: \Phi \subseteq B_1 \to B_2$ be a twice Fréchet differentiable nonlinear operator on the open set $\Phi$, where $B_1$ and $B_2$ are Banach spaces. Let $s_0 \in \Phi$, suppose that $\Delta_0 = [G'(s_0)]^{-1}$ exists, and let conditions (C1)–(C3) be satisfied. Let $\lambda_0 = K\beta\eta$, define $\eta_{n+1} = \mu_n \eta_n$ and $\mu_{n+1} = q(\lambda_{n+1}) p(\lambda_{n+1})$, and assume that $\lambda_0 < \sigma$ and $p(\lambda_0)\mu_0 < 1$, where $\sigma$ is the smallest positive root of the scalar function $s h(s) - 1$. If $B_e(s_0, R\eta) = \{s \in B_1 : \|s - s_0\| < R\eta\} \subset \Phi$ with $R = \frac{h(\lambda_0)}{1 - q(\lambda_0) p(\lambda_0)}$, then the sequence $\{s_n\}$ generated by (1.5) from the initial point $s_0$ converges to a solution $s^*$ of $G(s) = 0$. In this case, the sequences $\{s_n\}$ and $\{t_n\}$ are contained in $B_e(s_0, R\eta)$ and $s^* \in \overline{B_e}(s_0, R\eta)$. Moreover, $s^*$ is the unique solution of $G(s) = 0$ in $B_n\bigl(s_0, \frac{2}{K\beta} - R\eta\bigr) \cap \Phi$.
Proof: According to Lemma 2.1, we can write
Thus,
According to Lemmas 2.1 and 2.2, the functions $p(s)$ and $q(s)$ are increasing. We therefore bound $s_{n+1} - s_0$ by partial sums of a geometric series,
Therefore, since $p(\lambda_0) q(\lambda_0) < 1$ by Lemma 2.1, all iterates $s_n$ belong to $\overline{B_e}(s_0, R\eta)$. From Lemmas 2.1 and 2.2, $p(s)$, $q(s)$, and $h(s)$ are increasing and $\{\lambda_n\}$ is decreasing, which allows us to show that $\{s_n\}$ is a Cauchy sequence.
So, $\{s_n\}$ is a Cauchy sequence and hence convergent. Therefore, there exists $s^*$ such that $\lim_{n\to\infty} s_n = s^*$. Letting $n = 0$ and $m \to \infty$ in (4.3), we get $\|s^* - s_0\| \le R\eta$, which shows that $s^* \in \overline{B_e}(s_0, R\eta)$.
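The geometric-series estimate behind this argument can be sketched as follows. This is a reconstruction from relation ($IV_n$) of Lemma 3.1, the definition $\eta_{n+1} = \mu_n \eta_n$, and the monotonicity results of Lemma 2.2, writing $\mu_0 = p(\lambda_0) q(\lambda_0) < 1$; it is not quoted from the original displayed equations and its numbering is deliberately left open.

```latex
\begin{aligned}
\|s_{n+m} - s_n\|
  &\le \sum_{i=n}^{n+m-1} \|s_{i+1} - s_i\|
   \le \sum_{i=n}^{n+m-1} h(\lambda_i)\,\eta_i \\
  &\le h(\lambda_0)\,\eta \sum_{i=n}^{n+m-1} \mu_0^{\,i}
   = h(\lambda_0)\,\eta\,\mu_0^{\,n}\,\frac{1 - \mu_0^{\,m}}{1 - \mu_0}.
\end{aligned}
```

Here $h(\lambda_i) \le h(\lambda_0)$ because $h$ is increasing and $\{\lambda_n\}$ is decreasing, and $\eta_i = \eta \prod_{j<i} \mu_j \le \mu_0^{\,i}\,\eta$ because $\{\mu_n\}$ is decreasing. The right-hand side tends to zero as $n \to \infty$, and for $n = 0$, $m \to \infty$ it gives the bound $\frac{h(\lambda_0)}{1 - \mu_0}\,\eta = R\eta$.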
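Finally, we prove the uniqueness of $s^*$ in $B_n\bigl(s_0, \frac{2}{K\beta} - R\eta\bigr) \cap \Phi$.
so $\overline{B_e}(s_0, R\eta) \subset B_n\bigl(s_0, \frac{2}{K\beta} - R\eta\bigr) \cap \Phi$. We now assume that $t^*$ is another solution of $G(s) = 0$ in $B_n\bigl(s_0, \frac{2}{K\beta} - R\eta\bigr) \cap \Phi$ and prove that $s^* = t^*$. We first take the Taylor expansion of $G$ around $s^*$,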
so that
We need to prove that the operator $\int_0^1 G'(s^* + \tau(t^* - s^*))\,d\tau$ is invertible, which guarantees that $t^* - s^* = 0$. Applying hypothesis (C3),
It follows from Banach's lemma that the operator $\int_0^1 G'(s^* + \tau(t^* - s^*))\,d\tau \in L(B_1, B_2)$ is invertible. The proof is completed by using $0 = G(t^*) - G(s^*) = \int_0^1 G'(s^* + \tau(t^* - s^*))\,d\tau\,(t^* - s^*)$ to obtain $t^* = s^*$.
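The estimate that makes Banach's lemma applicable can be sketched as follows. It is a reconstruction under the hypotheses of Theorem 4.1, using (C1), (C3), $\|s^* - s_0\| \le R\eta$, and $\|t^* - s_0\| < \frac{2}{K\beta} - R\eta$; the integration variable $\tau$ is a notational choice made here to avoid a clash with $t^*$.

```latex
\begin{aligned}
\Bigl\| \Delta_0 \int_0^1 \bigl( G'(s^* + \tau (t^* - s^*)) - G'(s_0) \bigr)\, d\tau \Bigr\|
  &\le K\beta \int_0^1 \| s^* + \tau (t^* - s^*) - s_0 \| \, d\tau \\
  &\le K\beta \int_0^1 \bigl( (1 - \tau)\, \| s^* - s_0 \| + \tau\, \| t^* - s_0 \| \bigr)\, d\tau \\
  &< \frac{K\beta}{2} \Bigl( R\eta + \frac{2}{K\beta} - R\eta \Bigr) = 1 .
\end{aligned}
```

Since the norm is strictly less than one, $\Delta_0 \int_0^1 G'(s^* + \tau(t^* - s^*))\,d\tau$ is invertible, and so is the integral operator itself.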
5. Numerical experiments
In this section, we use the iterative method (1.5) to solve nonlinear systems and show that the recurrence relations we derived are reasonable. We also apply method (1.5) to a practical chemical problem to demonstrate its applicability.
Problem 1. Nonlinear integral equations appear in many branches of mathematical physics, such as fracture mechanics, hythermoelasticity, and fluid mechanics. We consider the nonlinear integral equation of Hammerstein type [16], which is a special form of the Urysohn type Volterra integral equation, and use the results obtained above to solve it, thereby demonstrating the applicability of the theoretical results. The Hammerstein equation has the form
where $s \in C(0,1)$, $x \in [0,1]$, $y \in [0,1]$, with the kernel $H$ given by
Equation (5.1) is solved by converting it into a system of nonlinear equations through discretization. The integral in (5.1) is approximated by Gauss-Legendre quadrature,
where $y_i$ and $\delta_i$ are the nodes and weights of the Gauss-Legendre quadrature, respectively. Denoting the approximation of $s(y_i)$ by $s_i$, $i = 1, 2, \ldots, m$, we approximate (5.1) by the system of nonlinear equations
where
One way to rewrite the system is
where $G: \mathbb{R}^L \to \mathbb{R}^L$ is a nonlinear operator, its Fréchet derivative $G'$ takes values in $L(\mathbb{R}^L, \mathbb{R}^L)$, and $\mathbb{R}^L$ is a Banach space. We apply method (1.5) to solve this nonlinear system.
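As an illustration of the discretization step, the following sketch computes Gauss-Legendre nodes $y_i$ and weights $\delta_i$ on $[0, 1]$ and uses them to approximate an integral by a weighted sum. The function names are hypothetical, and the kernel and right-hand side of (5.1) are not reproduced here, so the sketch checks the rule on simple polynomials instead.

```python
import math


def legendre_and_derivative(m, x):
    """Return (P_m(x), P_m'(x)) via the three-term recurrence, for |x| < 1."""
    p_prev, p = 1.0, x
    for k in range(2, m + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = m * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def gauss_legendre_unit_interval(m, tol=1e-14, max_iter=50):
    """Nodes and weights of the m-point Gauss-Legendre rule mapped to [0, 1]."""
    nodes, weights = [], []
    for i in range(1, m + 1):
        # Classical initial guess for the i-th root of P_m on [-1, 1].
        x = math.cos(math.pi * (i - 0.25) / (m + 0.5))
        for _ in range(max_iter):
            p, dp = legendre_and_derivative(m, x)
            dx = p / dp
            x -= dx                      # Newton step on P_m(x) = 0
            if abs(dx) < tol:
                break
        _, dp = legendre_and_derivative(m, x)
        w = 2.0 / ((1.0 - x * x) * dp * dp)
        nodes.append(0.5 * (x + 1.0))    # affine map from [-1, 1] to [0, 1]
        weights.append(0.5 * w)
    return nodes, weights


def quadrature(f, nodes, weights):
    """Approximate the integral of f over [0, 1] by sum_i delta_i f(y_i)."""
    return sum(w * f(y) for y, w in zip(nodes, weights))


# Seven nodes, matching the dimension L = 7 used in Problem 1.
nodes, weights = gauss_legendre_unit_interval(7)
approx = quadrature(lambda y: y**2, nodes, weights)  # integral of y^2 on [0, 1]
```

In the discretized system, the same weighted sum replaces the integral in each of the $m$ equations, with the kernel evaluated at the pairs of nodes.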
Using the infinity norm and taking $s_0 = (1.6, 1.6, 1.6, 1.6, 1.6, 1.6, 1.6)^T$ with $L = 7$, we get
The above values satisfy the conditions for semi-local convergence, so we apply the method to the system. By Theorem 4.1, the existence of a solution in $B_e(s_0, 6.3578)$ and its uniqueness in $B_n(s_0, 6.6419)$ are guaranteed. Table 1 gives the existence radius $R_e$ and the uniqueness radius $R_n$ for different initial estimates $s_0$ with equal components. We also note that when $(s_0)_i > 1.7$, $i = 1, 2, \ldots, 7$, the convergence conditions are not met, so convergence cannot be guaranteed.
When we use the iterative method (1.5) to solve Eq (5.2), the solution we obtain is
Table 1 lists the values of the parameters appearing in the conditions, together with the existence radius $R_e$ and the uniqueness radius $R_n$, for different initial values. Table 2 shows the errors and function values corresponding to different initial values and confirms numerically that the iterative method (1.5) converges with order six. The results in Tables 1 and 2 are consistent with each other: under the Kantorovich conditions [12], the iteration converges to a unique solution for different initial values, and the closer the initial value is to the root, the smaller the error estimate. A semi-local convergence proof, which guarantees the existence and uniqueness of the solution under certain assumptions, is especially valuable when the existence of a solution cannot be established by other means.
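One common way to read a convergence order off a table of errors is the computational order of convergence, $\rho_k \approx \ln(e_{k+1}/e_k) / \ln(e_k/e_{k-1})$. The sketch below assumes this estimator; the paper does not state which estimator was used, and the error values are synthetic (constructed so that $e_{k+1} = e_k^6$), not taken from Table 2.

```python
import math


def computational_order(errors):
    """Estimate rho_k = ln(e_{k+1}/e_k) / ln(e_k/e_{k-1}) from positive errors."""
    orders = []
    for k in range(1, len(errors) - 1):
        num = math.log(errors[k + 1] / errors[k])
        den = math.log(errors[k] / errors[k - 1])
        orders.append(num / den)
    return orders


# Synthetic errors with e_{k+1} = e_k**6 (illustrative only, not the paper's data).
synthetic_errors = [1e-1, 1e-6, 1e-36, 1e-216]
orders = computational_order(synthetic_errors)
```

For a sixth-order method the errors shrink so quickly that double precision underflows after a few iterations, so applying the estimator to real runs typically requires multiple-precision arithmetic.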
Problem 2. The equation of state for gases is one of the most important problems in practical chemistry, and we apply the iterative method (1.5) to it. Consider the van der Waals equation
where $a = 4.17\ \mathrm{atm \cdot L^2/mol^2}$ and $b = 0.0371\ \mathrm{L/mol}$. We consider a pressure of 945.36 kPa (9.33 atm), a temperature of 300.2 K, and 2 mol of nitrogen, and we seek the volume of the container. Substituting these data into Eq (5.8), we get
Taking $s_0 = 1.1$ and the infinity norm, we get
Therefore, the method satisfies the convergence conditions, a solution exists in $B_e(s_0, 3.3975)$, and the uniqueness domain is $B_n(s_0, 7.2177)$. Any initial value that satisfies the Kantorovich conditions can be used to solve the nonlinear equation. Using the iterative method (1.5) to solve (5.8) gives the root $s^* = 1.60917$. Table 3 leads to a similar conclusion: under the Kantorovich conditions, the iteration converges to a unique solution for different initial values, and the closer the initial value is to the root, the smaller the error estimate.
6. Conclusions
In this paper, the semi-local convergence of Cordero's sixth-order iterative method (1.5) was proved by the recursion method. First, we studied the properties of the auxiliary sequences $\eta_n$, $\lambda_n$, $\mu_n$ and the scalar functions $h(s)$, $p(s)$, $q(s)$. Second, we gave a neighborhood $B(s_0, R)$ centered at the initial point and proved that the iterative sequence converges to $s^* \in \overline{B}(s_0, R)$ with $G(s^*) = 0$; this yields the radius of convergence $R$ and establishes the existence of a solution. Finally, the uniqueness of the solution was proved by using Banach's lemma. Throughout the proof, only a Lipschitz condition on the first-order Fréchet derivative was used. The theoretical results were supported by numerical experiments.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (No. 61976027), the Natural Science Foundation of Liaoning Province (Nos. 2022-MS-371, 2023-MS-296), Educational Commission Foundation of Liaoning Province of China (Nos. LJKMZ20221492, LJKMZ20221498), and the Key Project of Bohai University (No. 0522xn078).
Conflict of interest
The authors declare no conflict of interest.