1. Introduction
Let $ \mathcal{H} $ be a real Hilbert space and let $ \mathcal{K} $ be a nonempty, closed, and convex subset of $ \mathcal{H} $. The equilibrium problem (EP) is to find an element $ u^* \in \mathcal{K} $ such that
where $ f : \mathcal{K} \times \mathcal{K} \rightarrow \mathbb{R} $ is a bifunction satisfying $ f(z, z) = 0 $ for all $ z \in \mathcal{K} $, and $ EP(f, \mathcal{K}) $ denotes the solution set of EP (1.1). EP (1.1) generalizes many problems in optimization, such as variational inequality problems, Nash equilibrium problems, and linear programming problems, among others.
To solve EP (1.1), the two-step extragradient method (TSEM) was proposed by Tran et al. [25] in 2008, inspired by the ideas of Korpelevich [10] for solving variational inequalities. A convergence theorem was proved under a Lipschitz-type condition on the bifunction $ f $; however, only weak convergence was obtained in Hilbert spaces. Strong convergence was achieved by Hieu [8] with the Halpern subgradient extragradient method (HSEM), adapted from the method of Kraikaew and Saejung [11] for variational inequalities. This method is defined by $ u\in \mathcal{H} $ and
where $ \lambda $ is a constant chosen in an interval determined by the Lipschitz-type constants of the bifunction $ f $, and $ \{\alpha_i\}\subset (0, 1) $ satisfies the standard conditions
Very recently, Muangchoo [14] combined a viscosity-type method with the extragradient algorithm to obtain a strong convergence theorem for EP (1.1). This method is defined by
where $ \mu\in\left(0, \sigma\right)\subset\left(0, \min\left\lbrace1, \frac{1}{2c_{1}}, \frac{1}{2c_{2}}\right\rbrace\right) $, $ g $ is a contraction on $ \mathcal{H} $ with contraction constant $ \xi\in[0, 1) $ (i.e., $ \|g(x)-g(y)\|\leq\xi\|x-y\|, \ \forall x, y\in \mathcal{H} $), $ \{\alpha_i\} $ satisfies the standard conditions $ \lim\limits_{i \to \infty} \alpha_i = 0 $ and $ \sum_{i=1}^{+\infty} \alpha_i = +\infty $, and the step sizes $ \{\lambda_i\} $ are updated adaptively, without knowledge of the Lipschitz-type constants of the bifunction $ f $, according to the following rule:
where $ K_i = f(z_i, t_i)-f(z_i, v_i)-c_{1}\|z_{i}-v_{i}\|^{2}-c_{2}\|t_{i}-v_{i}\|^{2}+1 $.
Finding techniques to speed up the convergence of such algorithms has attracted the interest of many mathematicians. One of them is the inertial technique, first introduced by Polyak [16]. Very recently, Shehu et al. [22] combined the inertial technique with a Halpern-type algorithm and the subgradient extragradient method to obtain strong convergence to a solution of $ EP(f, \mathcal{K}) $ when $ f $ is pseudomonotone. This method is defined by $ \gamma \in \mathcal{H} $ and
where the inertial parameter $ \delta_i\in[0, \frac{1}{3}) $, $ \tau\in(0, \frac{1}{2}] $, and the step size $ \{\lambda_i\} $ is updated so that it satisfies the following:
and $ \{\alpha_i\} $ again satisfies the standard conditions $ \lim\limits_{i \to \infty} \alpha_i = 0 $ and $ \sum_{i=1}^{\infty} \alpha_i = \infty $. This step-size update is restrictive in computation and cannot easily be modified in another way.
In this paper, motivated and inspired by the works above, we introduce a modified inertial viscosity extragradient-type method for solving equilibrium problems. Moreover, we apply our main result to a data classification problem in machine learning and demonstrate the performance of our algorithm by comparing it with several existing methods.
2. Preliminaries
Let us begin with some concepts of monotonicity of a bifunction [2,15]. Let $ \mathcal{K} $ be a nonempty, closed and convex subset of $ \mathcal{H} $. A bifunction $ f : \mathcal{H} \times \mathcal{H} \rightarrow \mathbb{R} $ is said to be:
(i) strongly monotone on $ \mathcal{K} $ if there exists a constant $ \gamma > 0 $ such that
(ii) monotone on $ \mathcal{K} $ if $ f(x, y) \leqslant - f(y, x), \forall x, y \in \mathcal{K}; $
(iii) pseudomonotone on $ \mathcal{K} $ if $ f(x, y) \geqslant 0 \Longrightarrow f(y, x) \leqslant 0, \forall x, y \in \mathcal{K}; $
(iv) satisfying Lipschitz-type condition on $ \mathcal{K} $ if there exist two positive constants $ c_1, c_2 $ such that
The normal cone $ N_\mathcal{K} $ to $ \mathcal{K} $ at a point $ x \in \mathcal{K} $ is defined by $ N_\mathcal{K} (x) = \{ w \in \mathcal{H} : \langle w, x-y \rangle \geqslant 0, \forall y \in \mathcal{K} \} $. For every $ x \in \mathcal{H} $, the metric projection $ P_\mathcal{K} x $ of $ x $ onto $ \mathcal{K} $ is the nearest point to $ x $ in $ \mathcal{K} $, that is, $ P_\mathcal{K} x = \arg\min\{\|y-x\| : y \in \mathcal{K} \} $. Since $ \mathcal{K} $ is nonempty, closed and convex, $ P_\mathcal{K} x $ exists and is unique. For each $ x, z \in \mathcal{H} $, we denote by $ \partial_2 f(z, x) $ the subdifferential of the convex function $ f(z, \cdot) $ at $ x $, i.e.,
In particular, for $ z \in \mathcal{K} $,
For proving the convergence of the proposed algorithm, we need the following lemmas.
Lemma 2.1. [1] For each $ x \in \mathcal{H} $ and $ \lambda > 0 $,
where $ prox_{\lambda g} (x) = \arg\min\{ \lambda g(y) + \frac{1}{2} \|x-y\|^2 : y \in \mathcal{K} \} $.
For any point $ u \in \mathcal{H} $, there exists a unique point $ P_\mathcal{K} u \in \mathcal{K} $ such that
$ P_\mathcal{K} $ is called the metric projection of $ \mathcal{H} $ onto $ \mathcal{K} $. We know that $ P_\mathcal{K} $ is a nonexpansive mapping of $ \mathcal{H} $ onto $ \mathcal{K} $, that is, $ \|P_\mathcal{K}x-P_\mathcal{K}y\|\leq \|x-y\|, \ \forall x, y \in \mathcal{H} $.
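To make these notions concrete, the following minimal sketch (illustrative only, not part of the original text) evaluates $ P_\mathcal{K} $ for a box $ \mathcal{K} = [-1, 1]^n $, where the projection is a componentwise clipping, and checks numerically both the nonexpansiveness of $ P_\mathcal{K} $ and the variational characterization $ \langle x - P_\mathcal{K} x, z - P_\mathcal{K} x \rangle \leqslant 0 $ for $ z \in \mathcal{K} $.

```python
import numpy as np

def project_box(x, lo, hi):
    # Metric projection P_K onto the box K = [lo, hi]^n (componentwise clipping).
    return np.clip(x, lo, hi)

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
px, py = project_box(x, -1.0, 1.0), project_box(y, -1.0, 1.0)

# Nonexpansiveness: ||P_K x - P_K y|| <= ||x - y||
assert np.linalg.norm(px - py) <= np.linalg.norm(x - y) + 1e-12

# Variational characterization: <x - P_K x, z - P_K x> <= 0 for every z in K
z = rng.uniform(-1.0, 1.0, size=5)
assert np.dot(x - px, z - px) <= 1e-12
```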
Lemma 2.2. [18] Let $ g : \mathcal{K} \rightarrow \mathbb{R} $ be a convex, subdifferentiable and lower semicontinuous function on $ \mathcal{K} $. Suppose the convex set $ \mathcal{K} $ has nonempty interior, or $ g $ is continuous at a point $ x^* \in \mathcal{K} $. Then, $ x^* $ is a solution to the following convex problem $ \min\{g(x) : x \in \mathcal{K}\} $ if and only if $ 0 \in \partial g(x^*) + N_\mathcal{K} (x^*) $, where $ \partial g(.) $ denotes the subdifferential of $ g $ and $ N_\mathcal{K} (x^*) $ is the normal cone of $ \mathcal{K} $ at $ x^* $.
Lemma 2.3. [19,22] Let $ S $ be a nonempty, closed, and convex subset of a real Hilbert space $ \mathcal{H} $. Let $ u \in \mathcal{H} $ be arbitrarily given, $ z = P_S u $, and $ \Omega = \{ x \in \mathcal{H} : \langle x-u, x-z \rangle \leqslant 0 \} $. Then $ \Omega \cap S = \{z\} $.
Lemma 2.4. [28] Let $ \{a_i\} $ and $ \{c_i\} $ be nonnegative sequences of real numbers such that $ \sum_{i = 1}^{\infty} c_i < \infty $, and let $ \{b_i\} $ be a sequence of real numbers such that $ \limsup \limits_{i \to \infty} b_i \leqslant 0 $. If, for every $ i \in \mathbb{N} $,
where $ \{\gamma_i\} $ is a sequence in $ (0, 1) $ such that $ \sum_{i = 1}^{\infty} \gamma_i = \infty $, then $ \lim\limits_{i \to \infty} a_i = 0 $.
Lemma 2.5. [29] Let $ \{a_i\} $, $ \{b_i\} $ and $ \{c_i\} $ be positive sequences such that
If $ \sum_{i = 1}^{ \infty} c_i < +\infty $ and $ \sum_{i = 1}^{\infty} b_i < +\infty $; then, $ \lim\nolimits_{i \to +\infty} a_i $ exists.
The convergence of Algorithm 3.1 will be established under the following conditions.
Condition 2.6. (A1) $ f $ is pseudomonotone on $ \mathcal{K} $ with $ int(\mathcal{K}) \neq \emptyset $ or $ f(x, .) $ is continuous at some $ z \in \mathcal{K} $ for every $ x \in \mathcal{K} $;
(A2) $ f $ satisfies Lipschitz-type condition on $ \mathcal{H} $ with two constants $ c_1 $ and $ c_2 $;
(A3) $ f(., y) $ is sequentially weakly upper semicontinuous on $ \mathcal{K} $ for each fixed $ y \in \mathcal{K} $, i.e., if $ \{x_i\} \subset \mathcal{K} $ is a sequence converging weakly to $ x \in \mathcal{K} $, then $ f(x, y) \geqslant \limsup \limits_{i \to \infty} f(x_i, y) $;
(A4) for each $ x \in \mathcal{H} $, $ f(x, .) $ is convex, lower semicontinuous, and subdifferentiable on $ \mathcal{H} $;
(A5) $ V : \mathcal{H} \rightarrow \mathcal{H} $ is a contraction with contraction constant $ \alpha $.
3. Main results
Now, we are in a position to present a modification of the extragradient algorithm (EGM) of [25] for equilibrium problems.
In this section, we will analyse the convergence of Algorithm 3.1.
For the rest of this paper, we assume the following condition.
Condition 3.2. (i) $ \{\alpha_i\} \subset (0, 1] $ is non-increasing with $ \lim\limits_{i \to \infty}{\alpha_i} = 0 $ and $ \sum\limits_{i = 1}^{\infty}{\alpha_i} = \infty $;
(ii) $ 0 \leq \theta_{i} \leq \theta_{i+1} \leq \theta < \frac{1}{3} $ and $ \lim\limits_{i \to \infty} \frac{\theta_i}{\alpha_i} \|x_i - x_{i-1} \| = 0 $;
(iii) $ EP(f, \mathcal{K}) \not = \emptyset $.
Before we prove the strong convergence result, we need some lemmas below.
Lemma 3.3. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_{i}\} $ be generated by Algorithm 3.1. Then there exists $ N > 0 $ such that
Proof. By the definition of $ y_i $, and Lemma 2.1, we have
Putting $ y = z_{i} $ into (3.2), we obtain
By the definition of $ z_i $, we have
(3.3) and (3.4) imply that
If $ f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) > 0 $, then
Observe that (3.6) is also satisfied if $ f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) \leq 0 $. By (3.5) and (3.6), we have
Note that
and
Using (3.8) and (3.9) in (3.7), we obtain, for all $ y \in \mathcal{K} $,
Taking $ y = u \in EP(f, \mathcal{K}) \subset \mathcal{K} $, one has $ f(u, y_i) \geqslant 0, \forall i $. By (A1), we obtain $ f(y_i, u) \leqslant 0, \ \forall i $. Hence, we obtain from (3.10) that
It follows from $ \lambda_i \in (0, \frac{1}{2\max\{c_1, c_2\}}) $ and (3.11) that
On the other hand, we have
Substituting (3.11) into (3.13), we obtain
Moreover, we have $ z_i-w_i = \frac{1}{\tau}(x_{i+1}-w_i) $, which together with (3.14) gives
where $ \epsilon = \frac{1-\tau}{\tau} $.
Lemma 3.4. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. Then, for all $ u \in EP(f, \mathcal{K}) $,
Proof. By Lemma 2.5, we have
Moreover, from the definition of $ w_i $, we obtain that
Replacing $ u $ by $ x_{i+1} $ in (3.18), we have
Substituting (3.18) and (3.19) into (3.17), we have
Therefore, we obtain
It follows that
Since $ {\theta_i} $ is non-decreasing and $ \alpha_i $ is non-increasing, we then obtain
Lemma 3.5. Assume that Conditions 2.6 and 3.2 hold. Then $ \{x_i\} $ generated by Algorithm 3.1 is bounded.
Proof. From (3.15) and Condition 3.2 (ii), there exists $ K > 0 $ such that
This implies that $ \|x_{i+1}-u\| \leqslant \max\{\|x_1-u\|, \frac{\|V(u)-u\| + K}{1-\alpha} \} $. This shows that $ \{x_i\} $ is bounded.
Lemma 3.6. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. For each $ i \geqslant 1 $, define
Then $ u_i \geqslant 0 $.
Proof. Since $ \{\theta_i\} $ is non-decreasing with $ 0 \leqslant \theta_i < \frac{1}{3} $, and $ 2\langle x, y \rangle = \|x\|^2 + \|y\|^2 -\|x-y\|^2 $ for all $ x, y \in H $, we have
This completes the proof.
Lemma 3.7. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. Suppose
and
Then $ \{x_i\} $ converges strongly to $ u \in EP(f, \mathcal{K}) $.
Proof. By our assumptions, we have
In the case
this implies that $ \{x_i\} $ converges strongly to $ u $ immediately. Assume this limit does not hold. Then there exist a subset $ N^* \subseteq \mathbb{N} $ and a constant $ \rho > 0 $ such that
Using (3.27) and $ \theta_i \leqslant \theta < 1 $, it then follows, for $ i \in N^* $, that
Consequently, we have $ \limsup \limits_{i \rightarrow \infty} \|x_i-u\| \leqslant 0 $. Since $ \liminf_{i \rightarrow \infty} \|x_i-u\| \geqslant 0 $ obviously holds, it follows that $ \lim\limits_{i \rightarrow \infty} \|x_i-u\| = 0 $. This implies (by (3.28))
for all $ i \in N^* $ sufficiently large, a contradiction to the assumption that $ \lim\limits_{i \to \infty} \|x_{i+1}-x_i\| = 0 $. This completes the proof.
We now give the following strong convergence result of Algorithm 3.1.
Theorem 3.8. Assume that Conditions 2.6 and 3.2 hold. Then $ \{x_i\} $ generated by Algorithm 3.1 strongly converges to the solution $ u = P_{EP(f, \mathcal{K})} V(u) $.
Proof. From Lemma 3.6 and (3.16), we have
Since $ P_{EP(f, \mathcal{K})} V $ is a contraction, by the Banach fixed point theorem there exists a unique $ u = P_{EP(f, \mathcal{K})} V(u) $. It follows from Lemma 3.3 that
We consider two cases.
Case 1. Suppose that $ u_{i+1} \leqslant u_i+t_i $ for all $ i \geqslant i_0 $ for some $ i_0 \in \mathbb{N} $, where $ t_i \geqslant 0 $ and $ \sum_{i = 1}^{\infty} t_i < +\infty $. Since $ u_i \geqslant 0, \ \forall i \geqslant 1 $, by Lemma 2.5, $ \lim\limits_{i \to \infty} u_i = \lim\limits_{i \to \infty} u_{i+1} $ exists. Since $ \{x_i\} $ is bounded by Lemma 3.5, there exist $ M_1 > 0 $ such that $ 2|\langle x_i-u, x_i-V(x_i)\rangle| \leqslant M_1 $ and $ M_2 > 0 $ such that $ \|x_{i+1}-V(x_{i+1})\|^2 + \|V(x_i)-x_{i+1}\|^2 \leqslant M_2 $. Since $ 0 \leqslant \theta_i \leqslant \theta_{i+1} \leqslant \theta < \frac{1}{3} $ and $ \lim\limits_{i \to \infty} \alpha_i = 0 $, there exist $ N \in \mathbb{N} $ and $ \gamma_1 > 0 $ such that $ 1-3\theta_{i+1}-\alpha_i \geqslant \gamma_1 $ for all $ i \geqslant N $. Therefore, for $ i \geqslant N $, we obtain from (3.29) that
as $ \ i \rightarrow \infty $. Hence $ \lim\limits_{i \to \infty} \|x_{i+1}-x_i\| = 0 $. For $ u \in EP(f, \mathcal{K}) $, we have
and from (3.14), we have
This implies that
By our condition and (3.31), we obtain
Since $ \{x_i\} $ is bounded, there exists a subsequence $ \{x_{i_k}\} $ of $ \{x_i\} $ such that $ x_{i_k} \rightharpoonup x^* $ for some $ x^* \in \mathcal{H} $. From (3.31) and (3.35), we get $ w_{i_k} \rightharpoonup x^* $ and $ y_{i_k} \rightharpoonup x^* $ as $ k \rightarrow \infty $.
By the definition of $ z_i $ and (3.6), we have
It follows from the boundedness of $ \{z_{i_k}\} $, $ 0 < \lambda_{i_k} \leqslant \lambda < \frac{1}{2\max\{c_1, c_2\}} $, and Condition 2.6 (A3) that $ 0 \leqslant \limsup \limits_{k \to \infty} f(y_{i_k}, y) \leqslant f(x^*, y) $ for all $ y \in H $. This implies that $ f(x^*, y) \geqslant 0 $ for all $ y \in \mathcal{K} $, which shows that $ x^* \in EP(f, \mathcal{K}) $. Then, we have
by $ u = P_{EP(f, \mathcal{K})} V(u) $. Applying (3.36) to the inequality (3.30) with Lemma 2.4, we can conclude that $ x_i \rightarrow u = P_{EP(f, \mathcal{K})} V(u) $ as $ i \rightarrow \infty $.
Case 2. In another case of $ \{u_i\} $, we let $ \phi : \mathbb{N} \rightarrow \mathbb{N} $ be the map defined for all $ i \geqslant i_0 $ (for some $ i_0 \in \mathbb{N} $ large enough) by
Clearly, $ \phi(i) $ is non-decreasing with $ \phi(i) \rightarrow \infty $ as $ i \rightarrow \infty $ and $ u_{\phi(i)}+t_{\phi(i)} \leqslant u_{\phi(i)+1} $ for all $ i \geqslant i_0 $. Hence, similar to the proof of Case 1, we obtain from (3.31) that
for some constant $ M_1, M_2 > 0 $. Thus
By the same proof as in Case 1, one can also derive
Again, observe that for $ j \geqslant 0 $, by (3.29), we have $ u_{j+1} < u_j+t_j $ when $ x_j \notin \Omega = \{x \in H : \langle x-x_0, x-u \rangle \leqslant 0\} $. Hence $ x_{\phi (i)} \in \Omega $ for all $ i \geqslant i_0 $, since $ u_{\phi (i)}+t_{\phi (i)} \leqslant u_{\phi (i)+1} $. Since $ \{x_{\phi (i)}\} $ is bounded, there exists a subsequence of $ \{x_{\phi (i)}\} $, still denoted by $ \{x_{\phi (i)}\} $, which converges weakly to some $ x^* \in \mathcal{H} $. As $ \Omega $ is closed and convex, it is weakly closed, and so $ x^* \in \Omega $. Using (3.40), one can see as in Case 1 that $ z_{\phi (i)} \rightharpoonup x^* $ and $ x^* \in EP(f, \mathcal{K}) $. Since $ \Omega \cap EP(f, \mathcal{K}) $ contains $ u $ as its only element, we therefore have $ x^* = u $. Furthermore,
due to $ x_{\phi (i)} \in \Omega $. This gives
Hence
By the definition of $ u_{\phi (i) +1} $, we have
By our Condition 3.2 (i), (3.39) and (3.41), we obtain $ \lim\limits_{i \to \infty} u_{\phi(i)+1} = 0 $. We next show that we actually have $ \lim\limits_{i \to \infty} u_i = 0 $. To this end, first observe that, for $ i \geqslant i_0 $, one has $ u_i+t_i \leqslant u_{\phi (i) + 1} $ if $ i \neq \phi(i) $. It follows that for all $ i \geqslant i_0 $, we have $ u_i \leqslant \max\{u_{\phi(i)}, u_{\phi (i) + 1} \} = u_{\phi (i) + 1} \rightarrow 0 $, since $ \lim\limits_{i \to \infty} t_i = 0 $, hence $ \limsup \limits_{i \to \infty} u_i \leqslant 0 $. On the other hand, Lemma 3.6 implies that $ \liminf_{i \to \infty} u_i \geqslant 0 $. Hence, we obtain $ \lim\limits_{i \to \infty} u_i = 0 $. Consequently, the boundedness of $ \{x_i\} $, $ \lim\limits_{i \to \infty} \alpha_i = 0 $, and (3.29) show that $ \|x_i - x_{i+1}\| \rightarrow 0 $, as $ i \rightarrow \infty $. Hence the definition of $ u_i $ yields $ (\|x_{i+1}-u\|^2 - \theta_i \|x_i-u\|^2) \rightarrow 0 $, as $ i \rightarrow \infty $. By using Lemma 3.7, we obtain the desired conclusion immediately.
Setting $ V(x) = x_0 $ for all $ x\in \mathcal{H} $, we obtain the following modified Halpern inertial extragradient algorithm for EPs:
In Algorithm 3.1, the convergence depends on the parameter $ \{\lambda_i\} $ under the condition $ 0 < \lambda_{i}\leq\lambda < \frac{1}{2\max\{c_1, c_2\}} $, so the step size $ \{\lambda_i\} $ can be chosen in several ways. Applying the step-size concept of Shehu et al. [22], we obtain the following modified viscosity-type inertial extragradient step-size algorithm for EPs:
Remark 3.11. (i) Since $ V(x) = x_0, \ \forall x\in \mathcal{H} $, is a contraction, the modified Halpern inertial extragradient Algorithm 3.9 converges strongly to $ x^* = P_{EP(f, \mathcal{K})}x_0 $ under Conditions 2.6 and 3.2;
(ii) Since the step size $ \{\lambda_i\} $ in Algorithm 3.10 is a monotonically decreasing sequence with lower bound $ \min\{\lambda_1, \frac{1}{2\max\{c_1, c_2\}}\} $ [22], Algorithm 3.10 converges strongly to the solution $ u = P_{EP(f, \mathcal{K})} V(u) $ by Theorem 3.8.
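For readers implementing such an adaptive rule, the following is a minimal sketch of one common step-size update of this type from the related literature; it is not a verbatim reproduction of the update used in Algorithm 3.10, and the names $ f $, $ w $, $ y $, $ z $, $ \mu $ refer to the bifunction, the current extragradient iterates, and a parameter in $ (0, 1) $.

```python
import numpy as np

def update_step_size(lam, mu, f, w, y, z):
    # One common adaptive step-size rule (a sketch; the exact update in
    # Algorithm 3.10 may differ): the resulting sequence is non-increasing and
    # bounded below, and it avoids the Lipschitz-type constants c1 and c2.
    denom = f(w, z) - f(w, y) - f(y, z)
    if denom > 0:
        candidate = mu * (np.linalg.norm(w - y) ** 2 + np.linalg.norm(z - y) ** 2) / (2.0 * denom)
        return min(lam, candidate)
    return lam
```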
We now give an example in the infinite-dimensional space $ L_2[0, 1] $ to support the main theorem.
Example 3.12. Let $ V : L_2[0, 1] \rightarrow L_2[0, 1] $ be defined by $ V(x(t)) = \frac{x(t)}{2} $, where $ x(t) \in L_2[0, 1] $. We choose $ x_0(t) = \frac{\sin(t)}{2} $ and $ x_1(t) = \sin(t) $. The stopping criterion is defined by $ \|x_i-x_{i-1}\| < 10^{-2} $.
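In a numerical implementation, functions in $ L_2[0, 1] $ can be represented on a uniform grid and the norm in the stopping criterion approximated by quadrature; the following is a minimal sketch under these assumptions (variable names are illustrative).

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1001)        # uniform grid on [0, 1]

def l2_norm(x):
    # Approximate L_2[0,1] norm of a function sampled on the grid t (trapezoidal rule).
    return np.sqrt(np.trapz(x ** 2, t))

V = lambda x: x / 2.0                  # viscosity mapping V(x(t)) = x(t)/2
x_prev = np.sin(t) / 2.0               # x_0(t) = sin(t)/2
x_curr = np.sin(t)                     # x_1(t) = sin(t)
print(l2_norm(x_curr - x_prev))        # compare against the tolerance 10^{-2}
```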
We set the following parameters for each algorithm, as seen in Table 1.
Next, we compare the performance of Algorithms 3.1, 3.9 and 3.10. We obtain the results as seen in Table 2.
From Figure 1, we see that the performance of Algorithm 3.10 is better than Algorithms 3.1 and 3.9.
4. Application to data classification problem
According to the International Diabetes Federation (IDF), there are approximately 463 million people with diabetes worldwide, and it is estimated that this number will reach 629 million by 2045. In Thailand, the incidence of diabetes is continuously increasing, with about 300,000 new cases per year and 3.2 million people with diabetes registered in the Ministry of Public Health's registration system. The disease causes huge losses in health care costs: diabetes alone accounts for average treatment costs of up to 47,596 million baht per year. This has led to an ongoing campaign about the dangers of the disease. Furthermore, diabetes mellitus is associated with additional noncommunicable diseases that put patients at high risk, as they are more susceptible to infectious diseases such as COVID-19 [23]. Because diabetes is a chronic disease that cannot be cured, there is a risk that complications spread to the extent of damaging vital organs of the body. The International Diabetes Federation and the World Health Organization (WHO) have designated November 14 of each year as World Diabetes Day to recognize the importance of this disease.
In this research, we used the PIMA Indians diabetes dataset, which was downloaded from Kaggle (https://www.kaggle.com/uciml/pima-indians-diabetesdatabase) and is publicly available on the UCI repository, for the training process of our proposed algorithm. The dataset contains 768 female patients, of whom 500 are non-diabetic and 268 are diabetic. There are 9 variables in the dataset: eight variables contain information about the patients, and the 9th variable is the class labeling each patient as diabetic or non-diabetic. The attributes are: number of times pregnant; plasma glucose concentration at 2 hours in an oral glucose tolerance test (GTIT); diastolic blood pressure (mm Hg); triceps skin fold thickness (mm); 2-hour serum insulin ($\mu$U/ml); body mass index [weight in kg/(height in m)$^2$]; diabetes pedigree function; age (years); and a binary value indicating non-diabetic/diabetic. For the implementation of the machine learning algorithms, 614 samples were used as the training dataset and 154 as the testing dataset, using 5-fold cross-validation [12]. For benchmarking the classifier, we consider the following methods, which have been proposed to classify diabetes (see Table 3):
We focus on extreme learning machine (ELM) proposed by Huang et al. [9] for applying our algorithms to solve data classification problems. It is defined as follows:
Let $ E : = \{(\mathbf{x}_n, \mathbf{t}_n) : \mathbf{x}_n\in\mathbb{R}^n, \mathbf{t}_n\in\mathbb{R}^m, n = 1, 2, ..., P\} $ be a training set of $ P $ distinct samples where $ \mathbf{x}_n $ is an input training data and $ \mathbf{t}_n $ is a training target. The output function of ELM for single-hidden layer feed forward neural networks (SLFNs) with $ M $ hidden nodes and activation function $ U $ is
where $ w_j $ and $ b_j $ are the weight and bias parameters, respectively, and $ \Theta_j $ is the optimal output weight at the $ j $-th hidden node. The hidden layer output matrix $ H $ is defined as follows:
Solving the ELM amounts to finding an optimal output weight $ \Theta = [\Theta^T_1, ..., \Theta^T_M]^T $ such that $ H \Theta = T $, where $ T = [\mathbf{t}^T_1, ..., \mathbf{t}^T_P]^T $ is the matrix of training targets. In some cases, one takes $ \Theta = H^{\dagger} T $, where $ H^{\dagger} $ is the Moore-Penrose generalized inverse of $ H $. However, when $ H^{\dagger} $ is unavailable or difficult to compute, finding such a solution $ \Theta $ through convex minimization can overcome this difficulty.
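As an illustration of this construction, the following sketch (with illustrative variable names and placeholder data, not the paper's experimental code) forms the hidden layer output matrix $ H $ with a sigmoid activation and computes $ \Theta = H^{\dagger} T $ via the Moore-Penrose pseudoinverse.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def hidden_layer_matrix(X, W, b):
    # H[n, j] = U(w_j . x_n + b_j) with sigmoid activation U.
    return sigmoid(X @ W.T + b)

rng = np.random.default_rng(0)
P, d, M = 614, 8, 160                               # samples, input features, hidden nodes
X = rng.normal(size=(P, d))                         # input training data (placeholder)
T = rng.integers(0, 2, size=(P, 1)).astype(float)   # training targets (placeholder)
W = rng.normal(size=(M, d))                         # random input weights w_j
b = rng.normal(size=M)                              # random biases b_j

H = hidden_layer_matrix(X, W, b)                    # P x M hidden layer output matrix
Theta = np.linalg.pinv(H) @ T                       # Theta = H^dagger T
```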
In this section, we carry out some experiments on the classification problem. This problem can be seen as the following convex minimization problem:
where $ \lambda $ is a regularization parameter. This problem is called the least absolute shrinkage and selection operator (LASSO) [26]. We set $ f(\Theta, \zeta) = \langle H^T(H\Theta-T), \zeta-\Theta\rangle $ and $ V(x) = Cx $, where $ C $ is a constant in $ (0, 1) $.
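With this choice, $ f(\Theta, \cdot) $ is affine, so the proximal subproblems of the extragradient scheme admit closed forms. The following sketch shows only the basic two-step extragradient (EGM) structure of [25] on which Algorithm 3.1 builds, specialized to this bifunction on an unconstrained set and with the $ \ell_1 $ regularization term omitted; it is an illustrative sketch, not the full Algorithm 3.1.

```python
import numpy as np

def G(theta, H, T):
    # G(theta) = H^T (H theta - T); with f(theta, zeta) = <G(theta), zeta - theta>,
    # each proximal subproblem below reduces to an explicit step.
    return H.T @ (H @ theta - T)

def extragradient_step(theta, lam, H, T):
    # y = argmin { lam * f(theta, y) + 0.5 ||theta - y||^2 } = theta - lam * G(theta)
    # z = argmin { lam * f(y, z)     + 0.5 ||theta - z||^2 } = theta - lam * G(y)
    y = theta - lam * G(theta, H, T)
    z = theta - lam * G(y, H, T)
    return z

# usage sketch (H, T as in the ELM construction; lam below 1/L, L = max eigenvalue of H^T H):
# L = np.linalg.eigvalsh(H.T @ H).max()
# theta = np.zeros((H.shape[1], 1))
# for _ in range(250):
#     theta = extragradient_step(theta, 0.99 / L, H, T)
```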
The binary cross-entropy loss function, together with the sigmoid activation function for binary classification, calculates the loss of an example by computing the following average:
where $ \hat{y}_j $ is the $ j $-th scalar value in the model output, $ y_j $ is the corresponding target value, and $ K $ is the number of scalar values in the model output.
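A minimal sketch of this loss (with illustrative names; the model outputs are assumed to have already passed through the sigmoid) is given below.

```python
import numpy as np

def binary_cross_entropy(y_true, y_hat, eps=1e-12):
    # Average binary cross-entropy over the K scalar outputs y_hat of the model.
    y_hat = np.clip(y_hat, eps, 1.0 - eps)   # avoid log(0)
    return -np.mean(y_true * np.log(y_hat) + (1.0 - y_true) * np.log(1.0 - y_hat))
```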
In this work, the performance of the machine learning techniques is measured for all classes. The accuracy is calculated as the ratio of the total number of correct predictions to the total number of predictions. The precision and recall performance parameters are also measured. The three measures [27] are defined as follows:
where the confusion matrix for the original and predicted classes is expressed in terms of $ \mathsf{TP} $, $ \mathsf{TN} $, $ \mathsf{FP} $, and $ \mathsf{FN} $, which denote the true positives, true negatives, false positives, and false negatives, respectively. Similarly, $ \mathsf{P} $ and $ \mathsf{N} $ are the positive and negative populations of the diabetic and non-diabetic cases, respectively.
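These three measures can be computed directly from the confusion matrix counts, as in the following sketch.

```python
def classification_measures(TP, TN, FP, FN):
    # Precision, recall, and accuracy from the confusion matrix counts.
    precision = TP / (TP + FP)
    recall = TP / (TP + FN)
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    return precision, recall, accuracy
```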
For starting our computation, we set the activation function as sigmoid, hidden nodes $ M = 160 $, regularization parameter $ \lambda = 1\times10^{-5} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $, $ \mu = 0.2 $ for Algorithms 3.1, 3.9 and 3.10, and $ C = 0.9999 $ for Algorithms 3.1 and 3.10. The stopping criterion is 250 iterations. We report the results for different parameters $ S $, where $ \lambda_i = \frac{S}{\max(eigenvalue(A^{T}A))} $ for Algorithms 3.1 and 3.9, and for different parameters $ \lambda_1 $ for Algorithm 3.10, in Table 4.
We can see that $ S = \lambda_1 = 0.99 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default step-size parameter for the next computations.
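For concreteness, the step size $ \lambda_i = \frac{S}{\max(eigenvalue(A^{T}A))} $ used above can be computed as in the sketch below, assuming that $ A $ denotes the hidden layer output matrix $ H $ of the ELM construction.

```python
import numpy as np

def step_size(S, H):
    # lambda_i = S / max(eigenvalue(A^T A)), assuming A is the hidden layer matrix H.
    return S / np.linalg.eigvalsh(H.T @ H).max()
```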
We next choose $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, with $ C = 0.9999 $ for Algorithm 3.1, and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $, $ C = 0.9999 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider different values of the inertial parameter $ \theta $, where
where $ N $ is the number of iterations at which we stop. We obtain the numerical results shown in Table 5.
We can see that $ \theta = 0.3 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default inertial parameter for the next computations.
We next set $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, with $ C = 0.9999 $ for Algorithm 3.1, and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $, $ C = 0.9999 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider different choices of the parameter $ \alpha_i $. The numerical results are shown in Table 6.
We can see that $ \alpha_i = \frac{1}{10i+1} $ greatly improves the performance of Algorithm 3.1, $ \alpha_i = \frac{1}{10i^2+1} $ greatly improves the performance of Algorithm 3.9, and $ \alpha_i = \frac{1}{i^2+1} $ greatly improves the performance of Algorithm 3.10. Therefore, we choose these as the default parameters for the next computations.
We next calculate the numerical results by setting $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{10i+1} $ and $ C = 0.9999 $ for Algorithm 3.1; $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{10i^2+1} $ for Algorithm 3.9; and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ C = 0.9999 $, $ \alpha_i = \frac{1}{i^2+1} $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider different choices of the parameter $ \tau $. The numerical results are shown in Table 7.
We can see that $ \tau = 0.5 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default parameter for the next computations.
We next calculate the numerical results by setting $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, with $ \alpha_i = \frac{1}{10i+1} $ for Algorithm 3.1 and $ \alpha_i = \frac{1}{10i^2+1} $ for Algorithm 3.9, and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{i^2+1} $, $ \tau = 0.5 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We report the results for different parameters $ C $, where $ V(x) = Cx $, for Algorithms 3.1 and 3.10 in Table 8.
From Tables 3–8, we choose the parameters of Algorithm 3.1 for comparison with existing algorithms from the literature. Table 9 shows the chosen parameters of each algorithm.
For the comparison, we set sigmoid as the activation function, hidden nodes $ M = 160 $, and regularization parameter $ \lambda = 1 \times 10^{-5} $.
Table 10 shows that Algorithm 3.1 has the highest precision, recall, and accuracy, as well as the lowest number of iterations. It has the highest probability of correctly classifying the samples compared with the other algorithms examined. We present the training and validation loss, together with the training accuracy, to show that Algorithm 3.1 does not overfit the training dataset.
From Figures 2 and 3, we see that our Algorithm 3.1 yields a good-fit model; that is, Algorithm 3.1 suitably learns the training dataset and generalizes well to classifying the PIMA Indians diabetes dataset.
5. Conclusions
In general, screening for diabetes in pregnancy follows the recommendations of the American College of Obstetricians and Gynecologists (ACOG). The accuracy of our method is $ 80.03\% $, and such high accuracy may be used to predict diabetes in pregnancy correctly in the future.
In this paper, we introduce a modified extragradient method with an inertial extrapolation step and a viscosity-type method to solve equilibrium problems with pseudomonotone bifunctions in real Hilbert spaces. We then prove a strong convergence theorem for the proposed algorithm under the assumption that the bifunction satisfies a Lipschitz-type condition. Moreover, we show that the step-size parameter $ \{\lambda_i\} $ can be chosen in several ways, which makes our algorithm flexible to use; see Algorithms 3.1 and 3.10. Finally, we show that our algorithms perform better than existing algorithms for the diabetes mellitus classification problem in machine learning.
Acknowledgments
This research was also supported by Fundamental Fund 2022, Chiang Mai University and the NSRF via the Program Management Unit for Human Resources and Institutional Development, Research and Innovation (grant number B05F640183). W. Cholamjiak would like to thank National Research Council of Thailand (N42A650334) and Thailand Science Research and Innovation, the University of Phayao (FF65-UOE).
Conflict of interest
The authors declare no conflict of interest.