
Perturbations of minimizing movements and curves of maximal slope

  • Received: 01 October 2017 Revised: 01 April 2018
  • 47J35, 47J30, 35B27, 35K90, 49J05

  • We modify De Giorgi's minimizing movements scheme for a functional $φ$ by perturbing the dissipation term, and we find a condition on the perturbations which ensures the convergence of the scheme to an absolutely continuous perturbed minimizing movement. The perturbations produce a variation of the metric derivative of the minimizing movement. This process is formalized by introducing the notion of a curve of maximal slope for $φ$ with a given rate. We show that if we relax the condition on the perturbations we may have many different meaningful effects; in particular, some perturbed minimizing movements may explore different potential wells.

    Citation: Antonio Tribuzio. Perturbations of minimizing movements and curves of maximal slope[J]. Networks and Heterogeneous Media, 2018, 13(3): 423-448. doi: 10.3934/nhm.2018019




    Let $ \mathcal{H} $ be a real Hilbert space and let $ \mathcal{K} $ be a nonempty, closed and convex subset of $ \mathcal{H} $. The equilibrium problem (EP) is to find an element $ u^* \in \mathcal{K} $ such that

    $ f(u^*, v) \geqslant 0, \quad \forall v \in \mathcal{K}, $ (1.1)

    where $ f : \mathcal{K} \times \mathcal{K} \rightarrow \mathbb{R} $ is a bifunction satisfying $ f(z, z) = 0 $ for all $ z \in \mathcal{K} $, and $ EP(f, \mathcal{K}) $ denotes the solution set of EP (1.1). EP (1.1) generalizes various problems in optimization, such as variational inequality problems, Nash equilibrium problems, and linear programming problems, among others.
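
    As a standard special case (recalled here for illustration), if $ F : \mathcal{K} \rightarrow \mathcal{H} $ is a given operator and we take $ f(u, v) = \langle F(u), v-u \rangle $, then EP (1.1) reduces to the classical variational inequality problem of finding $ u^* \in \mathcal{K} $ such that

    $ \langle F(u^*), v-u^* \rangle \geqslant 0, \quad \forall v \in \mathcal{K}. $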

    To solve the problem EP (1.1), the two-step extragradient method (TSEM) was proposed by Tran et al. [25] in 2008, inspired by Korpelevich's extragradient method for variational inequalities [10]. Under a Lipschitz-type condition on the bifunction $ f $, a convergence theorem was proved; however, only weak convergence was obtained in Hilbert spaces. Strong convergence was later obtained by Hieu [8] with the Halpern subgradient extragradient method (HSEM), which was adapted from the method of Kraikaew and Saejung [11] for variational inequalities. This method is defined by $ u\in \mathcal{H} $ and

    $ \begin{cases} y_i = \arg\min\limits_{y \in \mathcal{K}}\{\lambda f(x_i, y) + \frac{1}{2}\|x_i - y\|^2\}, \\ T_i = \{v \in \mathcal{H} : \langle (x_i - \lambda t_i) - y_i, v - y_i \rangle \leqslant 0\}, \ t_i \in \partial_2 f(x_i, y_i), \\ z_i = \arg\min\limits_{y \in T_i}\{\lambda f(y_i, y) + \frac{1}{2}\|x_i - y\|^2\}, \\ x_{i+1} = \alpha_i u + (1-\alpha_i) z_i, \end{cases} $ (1.2)

    where $ \lambda $ is a constant chosen from an interval determined by the Lipschitz-type constants of the bifunction $ f $, and $ \{\alpha_i\}\subset (0, 1) $ satisfies the conditions

    $ \lim\limits_{i \to \infty}\alpha_i = 0, \quad \sum\limits_{i=1}^{\infty}\alpha_i = \infty. $ (1.3)

    Very recently, Muangchoo [14] combined a viscosity-type method with the extragradient algorithm to obtain a strong convergence theorem for EP (1.1). This method is defined by

    $ \begin{cases} v_i = \arg\min\limits_{v \in \mathcal{K}}\{\lambda_i f(z_i, v) + \frac{1}{2}\|z_i - v\|^2\}, \\ \varepsilon_i = \{t \in \mathcal{H} : \langle z_i - \lambda_i \omega_i - v_i, t - v_i \rangle \leqslant 0\}, \ \omega_i \in \partial_2 f(z_i, v_i), \\ t_i = \arg\min\limits_{v \in \varepsilon_i}\{\mu\lambda_i f(v_i, v) + \frac{1}{2}\|z_i - v\|^2\}, \\ z_{i+1} = \alpha_i g(z_i) + (1-\alpha_i) t_i, \end{cases} $ (1.4)

    where $ \mu\in\left(0, \sigma\right)\subset\left(0, \min\left\lbrace1, \frac{1}{2c_{1}}, \frac{1}{2c_{2}}\right\rbrace\right) $, $ g $ is a contraction on $ \mathcal{H} $ with contraction constant $ \xi\in[0, 1) $ (i.e., $ \|g(x)-g(y)\|\leq\xi\|x-y\| $ for all $ x, y\in \mathcal{H} $), $ \{\alpha_i\} $ satisfies the conditions $ \lim\limits_{i \to \infty} \alpha_i = 0 $ and $ \sum_{i=1}^{+\infty} \alpha_i = +\infty $, and the step sizes $ \{\lambda_i\} $ are generated by an update rule that does not require knowledge of the Lipschitz-type constants of the bifunction $ f $:

    $ \lambda_{i+1} = \begin{cases} \min\left\{\sigma, \dfrac{\mu f(v_i, t_i)}{K_i}\right\}, & \text{if } \dfrac{\mu f(v_i, t_i)}{K_i} > 0, \\ \lambda_0, & \text{otherwise}, \end{cases} $ (1.5)

    where $ K_i = f(z_i, t_i)-f(z_i, v_i)-c_{1}\|z_{i}-v_{i}\|^{2}-c_{2}\|t_{i}-v_{i}\|^{2}+1 $.

    Finding techniques to speed up the convergence of algorithms has attracted the interest of many mathematicians. One such technique is the inertial method, which was first introduced by Polyak [16]. Very recently, Shehu et al. [22] combined the inertial technique with a Halpern-type algorithm and the subgradient extragradient method to obtain strong convergence to a solution of $ EP(f, \mathcal{K}) $ when $ f $ is pseudomonotone. This method is defined by $ \gamma \in \mathcal{H} $ and

    $ \begin{cases} w_i = \alpha_i\gamma + (1-\alpha_i)z_i + \delta_i(z_i - z_{i-1}), \\ v_i = \arg\min\limits_{v \in \mathcal{K}}\{\lambda_i f(w_i, v) + \frac{1}{2}\|w_i - v\|^2\}, \\ \varepsilon_i = \{t \in \mathcal{H} : \langle (w_i - \lambda_i\omega_i) - v_i, t - v_i \rangle \leqslant 0\}, \ \omega_i \in \partial_2 f(w_i, v_i), \\ t_i = \arg\min\limits_{v \in \varepsilon_i}\{\lambda_i f(v_i, v) + \frac{1}{2}\|w_i - v\|^2\}, \\ z_{i+1} = \tau w_i + (1-\tau) t_i, \end{cases} $ (1.6)

    where the inertial parameter $ \delta_i\in[0, \frac{1}{3}) $ and $ \tau\in(0, \frac{1}{2}] $, and the updated step size $ \{\lambda_i\} $ satisfies the following:

    $ \lambda_{i+1} = \begin{cases} \min\left\{\dfrac{\mu}{2}\dfrac{\|w_i - v_i\|^2 + \|t_i - v_i\|^2}{f(w_i, t_i) - f(w_i, v_i) - f(v_i, t_i)}, \lambda_i\right\}, & \text{if } f(w_i, t_i) - f(w_i, v_i) - f(v_i, t_i) > 0, \\ \lambda_i, & \text{otherwise}, \end{cases} $ (1.7)

    and $ \{\alpha_i\} $ still satisfies the conditions $ \lim\limits_{i \to \infty} \alpha_i = 0 $ and $ \sum_{i=1}^{\infty} \alpha_i = \infty $. This update rule for the step size $ \{\lambda_i\} $ is restrictive in computation and cannot easily be modified in other ways.

    In this paper, motivated and inspired by the above works in literature, we introduce a modified inertial viscosity extragradient type method for solving equilibrium problems. Moreover, we apply our main result to solve a data classification problem in machine learning and show the performance of our algorithm by comparing with many existing methods.

    Let us begin with some concepts of monotonicity of a bifunction [2,15]. Let $ \mathcal{K} $ be a nonempty, closed and convex subset of $ \mathcal{H} $. A bifunction $ f : \mathcal{H} \times \mathcal{H} \rightarrow \mathbb{R} $ is said to be:

    (i) strongly monotone on $ \mathcal{K} $ if there exists a constant $ \gamma > 0 $ such that

    $ f(x, y) + f(y, x) \leqslant -\gamma\|x-y\|^2, \ \forall x, y \in \mathcal{K}; $

    (ii) monotone on $ \mathcal{K} $ if $ f(x, y) \leqslant - f(y, x), \forall x, y \in \mathcal{K}; $

    (iii) pseudomonotone on $ \mathcal{K} $ if $ f(x, y) \geqslant 0 \Longrightarrow f(y, x) \leqslant 0, \forall x, y \in \mathcal{K}; $

    (iv) satisfying Lipschitz-type condition on $ \mathcal{K} $ if there exist two positive constants $ c_1, c_2 $ such that

    $ f(x, z) - f(x, y) - f(y, z) \leqslant c_1\|x-y\|^2 + c_2\|y-z\|^2, \quad \forall x, y, z \in \mathcal{K}. $
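
    For instance (a standard observation, included here for illustration), if $ f(x, y) = \langle F(x), y-x \rangle $ for an operator $ F $ that is Lipschitz continuous with constant $ L $, then $ f(x, z)-f(x, y)-f(y, z) = \langle F(x)-F(y), z-y \rangle \leqslant L\|x-y\|\|z-y\| \leqslant \frac{L}{2}\|x-y\|^2 + \frac{L}{2}\|z-y\|^2 $, so the Lipschitz-type condition holds with $ c_1 = c_2 = \frac{L}{2} $.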

    The normal cone $ N_\mathcal{K} $ to $ \mathcal{K} $ at a point $ x \in \mathcal{K} $ is defined by $ N_\mathcal{K} (x) = \{ w \in \mathcal{H} : \langle w, x-y \rangle \geqslant 0, \forall y \in \mathcal{K} \} $. For every $ x \in \mathcal{H} $, the metric projection $ P_\mathcal{K} x $ of $ x $ onto $ \mathcal{K} $ is the nearest point of $ x $ in $ \mathcal{K} $, that is, $ P_\mathcal{K} x = \arg\min\{\|y-x\| : y \in \mathcal{K} \} $. Since $ \mathcal{K} $ is nonempty, closed and convex, $ P_\mathcal{K} x $ exists and is unique. For each $ x, z \in \mathcal{H} $, by $ \partial_2 f(z, x) $, we denote the subdifferential of convex function $ f(z, .) $ at $ x $, i.e.,

    $ \partial_2 f(z, x) = \{u \in \mathcal{H} : f(z, y) \geqslant \langle u, y-x \rangle + f(z, x), \ \forall y \in \mathcal{H}\}. $

    In particular, for $ z \in \mathcal{K} $,

    $ \partial_2 f(z, z) = \{u \in \mathcal{H} : f(z, y) \geqslant \langle u, y-z \rangle, \ \forall y \in \mathcal{H}\}. $

    For proving the convergence of the proposed algorithm, we need the following lemmas.

    Lemma 2.1. [1] For each $ x \in \mathcal{H} $ and $ \lambda > 0 $,

    $ \frac{1}{\lambda}\langle x - \mathrm{prox}_{\lambda g}(x), y - \mathrm{prox}_{\lambda g}(x) \rangle \leqslant g(y) - g(\mathrm{prox}_{\lambda g}(x)), \ \forall y \in \mathcal{K}, $

    where $ prox_{\lambda g} (x) = \arg\min\{ \lambda g(y) + \frac{1}{2} \|x-y\|^2 : y \in \mathcal{K} \} $.
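
    To illustrate how the prox/argmin subproblem in Lemma 2.1 is typically evaluated in practice, the following minimal Python sketch (not from the paper) treats the common special case $ f(x, y) = \langle F(x), y-x \rangle $, for which $ \mathrm{prox}_{\lambda f(x, \cdot)}(x) = P_\mathcal{K}(x-\lambda F(x)) $; the box-shaped constraint set and the affine operator $ F $ are illustrative assumptions.

```python
import numpy as np

def project_box(x, lower=-1.0, upper=1.0):
    # Euclidean projection onto the box K = [lower, upper]^n
    return np.clip(x, lower, upper)

def prox_linearized(x, F, lam, lower=-1.0, upper=1.0):
    # prox_{lam * f(x, .)}(x) for f(x, y) = <F(x), y - x>:
    # argmin_{y in K} { lam*<F(x), y> + 0.5*||x - y||^2 } = P_K(x - lam*F(x))
    return project_box(x - lam * F(x), lower, upper)

# toy usage with an affine (hence Lipschitz) operator F(x) = A x + b
A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
F = lambda x: A @ x + b
print(prox_linearized(np.zeros(2), F, lam=0.1))
```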

    For any point $ u \in \mathcal{H} $, there exists a unique point $ P_\mathcal{K} u \in \mathcal{K} $ such that

    $ \|u - P_\mathcal{K} u\| \leqslant \|u - y\|, \ \forall y \in \mathcal{K}. $

    $ P_\mathcal{K} $ is called the metric projection of $ \mathcal{H} $ onto $ \mathcal{K} $. We know that $ P_\mathcal{K} $ is a nonexpansive mapping of $ \mathcal{H} $ onto $ \mathcal{K} $, that is, $ \|P_\mathcal{K}x-P_\mathcal{K}y\|\leq \|x-y\|, \ \forall x, y \in \mathcal{H} $.

    Lemma 2.2. [18] Let $ g : \mathcal{K} \rightarrow \mathbb{R} $ be a convex, subdifferentiable and lower semicontinuous function on $ \mathcal{K} $. Suppose the convex set $ \mathcal{K} $ has nonempty interior, or $ g $ is continuous at a point $ x^* \in \mathcal{K} $. Then, $ x^* $ is a solution to the following convex problem $ \min\{g(x) : x \in \mathcal{K}\} $ if and only if $ 0 \in \partial g(x^*) + N_\mathcal{K} (x^*) $, where $ \partial g(.) $ denotes the subdifferential of $ g $ and $ N_\mathcal{K} (x^*) $ is the normal cone of $ \mathcal{K} $ at $ x^* $.

    Lemma 2.3. [19,22] Let $ S $ be a nonempty, closed, and convex subset of a real Hilbert space $ \mathcal{H} $. Let $ u \in \mathcal{H} $ be arbitrarily given, $ z = P_S u $, and $ \Omega = \{ x \in \mathcal{H} : \langle x-u, x-z \rangle \leqslant 0 \} $. Then $ \Omega \cap S = \{z\} $.

    Lemma 2.4. [28] Let $ \{a_i\} $ and $ \{c_i\} $ be nonnegative sequences of real numbers such that $ \sum_{i = 1}^{\infty} c_i < \infty $, and let $ \{b_i\} $ be a sequence of real numbers such that $ \limsup \limits_{i \to \infty} b_i \leqslant 0 $. If for all $ i \in \mathbb{N} $,

    $ a_{i+1} \leqslant (1-\gamma_i)a_i + \gamma_i b_i + c_i, $

    where $ \{\gamma_i\} $ is a sequence in $ (0, 1) $ such that $ \sum_{i = 1}^{\infty} \gamma_i = \infty $, then $ \lim\limits_{i \to \infty} a_i = 0 $.

    Lemma 2.5. [29] Let $ \{a_i\} $, $ \{b_i\} $ and $ \{c_i\} $ be positive sequences, such that

    $ a_{i+1} \leqslant (1+c_i)a_i + b_i, \ \forall i \geqslant 1. $

    If $ \sum_{i = 1}^{ \infty} c_i < +\infty $ and $ \sum_{i = 1}^{\infty} b_i < +\infty $; then, $ \lim\nolimits_{i \to +\infty} a_i $ exists.

    The convergence of Algorithm 3.1 will be established under the following conditions.

    Condition 2.6. (A1) $ f $ is pseudomonotone on $ \mathcal{K} $ with $ int(\mathcal{K}) \neq \emptyset $ or $ f(x, .) $ is continuous at some $ z \in \mathcal{K} $ for every $ x \in \mathcal{K} $;

    (A2) $ f $ satisfies Lipschitz-type condition on $ \mathcal{H} $ with two constants $ c_1 $ and $ c_2 $;

    (A3) $ f(., y) $ is sequentially weakly upper semicontinuous on $ \mathcal{K} $ for each fixed point $ y \in \mathcal{K} $, i.e., if $ \{x_i\} \subset \mathcal{K} $ is a sequence converging weakly to $ x \in \mathcal{K} $, then $ f(x, y) \geqslant \limsup \limits_{i \to \infty} f(x_i, y) $;

    (A4) for $ x \in \mathcal{H} $, $ f(x, .) $ is convex and lower semicontinuous, subdifferentiable on $ \mathcal{H} $;

    (A5) $ V : \mathcal{H} \rightarrow \mathcal{H} $ is a contraction with contraction constant $ \alpha \in [0, 1) $.

    Now, we are in a position to present a modification of the extragradient algorithm (EGM) of [25] for equilibrium problems.

    Algorithm 3.1. (Modified viscosity type inertial extragradient algorithm for EPs)
    Initialization. Let $ x_0, x_1 \in H $, $ 0 < \lambda_i \leqslant \lambda < \frac{1}{2\max\{c_1, c_2\}} $, $ \tau \in (0, \frac{1}{2}] $.
    Step 1. Given the current iterates $ x_{i-1} $ and $ x_i (i \geqslant 1) $ and $ \alpha_i \in (0, 1) $, $ \theta_i \in [0, \frac{1}{3}) $, compute
    $ w_i = \alpha_i V(x_i) + (1-\alpha_i)x_i + \theta_i(x_i - x_{i-1}), \quad y_i = \arg\min\limits_{y \in \mathcal{K}}\{\lambda_i f(w_i, y) + \frac{1}{2}\|y - w_i\|^2\}. $
    If $ y_i = w_i $ then stop and $ y_i $ is a solution. Otherwise, go to Step 2.
    Step 2. Compute
    $ z_i = \arg\min\limits_{y \in \mathcal{K}}\{\lambda_i f(y_i, y) + \frac{1}{2}\|y - w_i\|^2\}. $
    Step 3. Compute
    $ x_{i+1} = (1-\tau)w_i + \tau z_i. $
    Set $ i = i+1 $ and return to Step 1.
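
    To make the iteration concrete, the following minimal Python sketch (not from the paper) implements Steps 1-3 for the special bifunction $ f(x, y) = \langle F(x), y-x \rangle $ on a box $ \mathcal{K} $, for which both argmin subproblems reduce to projections; the affine operator $ F $, the box constraint, the contraction $ V(x) = x/2 $ and the parameter values are illustrative assumptions.

```python
import numpy as np

def project_box(x, lo=-1.0, hi=1.0):
    # Euclidean projection onto K = [lo, hi]^n
    return np.clip(x, lo, hi)

def algorithm_3_1(F, V, x0, x1, lam=0.1, tau=0.5, theta=0.29,
                  alpha=lambda i: 1.0 / (100 * i + 1), max_iter=200, tol=1e-8):
    # Sketch of Algorithm 3.1 for f(x, y) = <F(x), y - x> on a box K.
    # Both argmin subproblems become projections:
    # argmin_{y in K} { lam*f(w, y) + 0.5*||y - w||^2 } = P_K(w - lam*F(w)).
    x_prev, x = x0.copy(), x1.copy()
    for i in range(1, max_iter + 1):
        a = alpha(i)
        w = a * V(x) + (1 - a) * x + theta * (x - x_prev)   # inertial viscosity step
        y = project_box(w - lam * F(w))                     # y_i
        if np.linalg.norm(y - w) < tol:                     # numerical test for y_i = w_i
            return y, i
        z = project_box(w - lam * F(y))                     # z_i (extragradient step)
        x_prev, x = x, (1 - tau) * w + tau * z              # x_{i+1}
    return x, max_iter

# toy usage: monotone (hence pseudomonotone) affine F on K = [-1, 1]^2
A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
F = lambda x: A @ x + b
V = lambda x: 0.5 * x                                       # a contraction
sol, iters = algorithm_3_1(F, V, x0=np.zeros(2), x1=np.ones(2))
print(sol, iters)
```

    For a general bifunction $ f $, the two argmin subproblems are strongly convex programs over $ \mathcal{K} $ and would be solved by an inner convex solver instead of a single projection.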

    In this section, we will analyse the convergence of Algorithm 3.1.

    For the rest of this paper, we assume the following condition.

    Condition 3.2. (i) $ \{\alpha_i\} \subset (0, 1] $ is non-increasing with $ \lim\limits_{i \to \infty}{\alpha_i} = 0 $ and $ \sum\limits_{i = 1}^{\infty}{\alpha_i} = \infty $;

    (ii) $ 0 \leq \theta_{i} \leq \theta_{i+1} \leq \theta < \frac{1}{3} $ and $ \lim\limits_{i \to \infty} \frac{\theta_i}{\alpha_i} \|x_i - x_{i-1} \| = 0 $;

    (iii) $ EP(f, \mathcal{K}) \not = \emptyset $.

    Before we prove the strong convergence result, we need some lemmas below.

    Lemma 3.3. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_{i}\} $ be generated by Algorithm 3.1. Then there exists $ N > 0 $ such that

    $ \|x_{i+1}-u\|^2 \leqslant \|w_i-u\|^2 - \|x_{i+1}-w_i\|^2, \quad \forall u \in EP(f, \mathcal{K}), \ \forall i \geqslant N. $ (3.1)

    Proof. By the definition of $ y_i $, and Lemma 2.1, we have

    $ \frac{1}{\lambda_i}\langle w_i - y_i, y - y_i \rangle \leqslant f(w_i, y) - f(w_i, y_i), \ \forall y \in \mathcal{K}. $ (3.2)

    Putting $ y = z_{i} $ into (3.2), we obtain

    $ \frac{1}{\lambda_i}\langle y_i - w_i, y_i - z_i \rangle \leqslant f(w_i, z_i) - f(w_i, y_i). $ (3.3)

    By the definition of $ z_i $, we have

    $ \frac{1}{\lambda_i}\langle w_i - z_i, y - z_i \rangle \leqslant f(y_i, y) - f(y_i, z_i), \ \forall y \in \mathcal{K}. $ (3.4)

    (3.3) and (3.4) imply that

    $ \frac{2}{\lambda_i}\langle w_i - z_i, y - z_i \rangle + \frac{2}{\lambda_i}\langle y_i - w_i, y_i - z_i \rangle \leqslant 2f(y_i, y) + 2\big(f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i)\big). $ (3.5)

    If $ f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) > 0 $, then

    $ f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) \leqslant c_1\|w_i - y_i\|^2 + c_2\|z_i - y_i\|^2. $ (3.6)

    Observe that (3.6) is also satisfied if $ f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) \leq 0 $. By (3.5) and (3.6), we have

    $ \langle w_i - z_i, y - z_i \rangle + \langle y_i - w_i, y_i - z_i \rangle \leqslant \lambda_i f(y_i, y) + \lambda_i c_1\|w_i - y_i\|^2 + \lambda_i c_2\|z_i - y_i\|^2. $ (3.7)

    Note that

    $ \langle w_i - z_i, z_i - y \rangle = \frac{1}{2}\big(\|w_i - y\|^2 - \|w_i - z_i\|^2 - \|z_i - y\|^2\big), $ (3.8)

    and

    $ \langle w_i - y_i, z_i - y_i \rangle = \frac{1}{2}\big(\|w_i - y_i\|^2 + \|z_i - y_i\|^2 - \|w_i - z_i\|^2\big). $ (3.9)

    Using (3.8) and (3.9) in (3.7), we obtain, for all $ y \in \mathcal{K} $,

    $ \|z_i - y\|^2 \leqslant \|w_i - y\|^2 - (1 - 2\lambda_i c_1)\|w_i - y_i\|^2 - (1 - 2\lambda_i c_2)\|z_i - y_i\|^2 + 2\lambda_i f(y_i, y). $ (3.10)

    Taking $ y = u \in EP(f, \mathcal{K}) \subset \mathcal{K} $, one has $ f(u, y_i) \geqslant 0, \forall i $. By (A1), we obtain $ f(y_i, u) \leqslant 0, \ \forall i $. Hence, we obtain from (3.10) that

    $ \|z_i - u\|^2 \leqslant \|w_i - u\|^2 - (1 - 2\lambda_i c_1)\|w_i - y_i\|^2 - (1 - 2\lambda_i c_2)\|z_i - y_i\|^2. $ (3.11)

    It follows from $ \lambda_i \in (0, \frac{1}{2\max\{c_1, c_2\}}) $ and (3.11) that

    $ \|z_i - u\| \leqslant \|w_i - u\|. $ (3.12)

    On the other hand, we have

    $ \|x_{i+1} - u\|^2 = (1-\tau)\|w_i - u\|^2 + \tau\|z_i - u\|^2 - (1-\tau)\tau\|z_i - w_i\|^2. $ (3.13)

    Substituting (3.11) into (3.13), we obtain

    $ \|x_{i+1} - u\|^2 \leqslant (1-\tau)\|w_i - u\|^2 + \tau\|w_i - u\|^2 - \tau(1 - 2\lambda_i c_1)\|w_i - y_i\|^2 - \tau(1 - 2\lambda_i c_2)\|z_i - y_i\|^2 - (1-\tau)\tau\|z_i - w_i\|^2. $ (3.14)

    Moreover, we have $ z_i-w_i = \frac{1}{\tau}(x_{i+1}-w_i) $, which together with (3.14) gives

    $ \|x_{i+1} - u\|^2 \leqslant \|w_i - u\|^2 - \tau(1 - 2\lambda_i c_1)\|w_i - y_i\|^2 - \tau(1 - 2\lambda_i c_2)\|z_i - y_i\|^2 - (1-\tau)\tau\frac{1}{\tau^2}\|x_{i+1} - w_i\|^2 \\ \leqslant \|w_i - u\|^2 - \frac{1-\tau}{\tau}\|x_{i+1} - w_i\|^2 \\ \leqslant \|w_i - u\|^2 - \epsilon\|x_{i+1} - w_i\|^2, \quad \forall i \geqslant N, $ (3.15)

    where $ \epsilon = \frac{1-\tau}{\tau} $. Since $ \tau \in (0, \frac{1}{2}] $, we have $ \epsilon \geqslant 1 $, and (3.1) follows.

    Lemma 3.4. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. Then, for all $ u \in EP(f, \mathcal{K}) $,

    $ -2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle \geqslant \|x_{i+1} - u\|^2 - \|x_i - u\|^2 + 2\theta_{i+1}\|x_{i+1} - x_i\|^2 - 2\theta_i\|x_i - x_{i-1}\|^2 + \alpha_{i+1}\|V(x_i) - x_{i+1}\|^2 - \alpha_i\|x_i - V(x_i)\|^2 - \theta_i\|x_i - u\|^2 + \theta_{i-1}\|x_{i-1} - u\|^2 + (1 - 3\theta_{i+1} - \alpha_i)\|x_i - x_{i+1}\|^2. $ (3.16)

    Proof. By Lemma 3.3, we have

    $ \|x_{i+1} - u\|^2 \leqslant \|w_i - u\|^2 - \|x_{i+1} - w_i\|^2. $ (3.17)

    Moreover, from the definition of $ w_i $, we obtain that

    $ \|w_i - u\|^2 = \|x_i - u\|^2 + \|\theta_i(x_i - x_{i-1}) - \alpha_i(x_i - V(x_i))\|^2 + 2\langle x_i - u, \theta_i(x_i - x_{i-1}) - \alpha_i(x_i - V(x_i)) \rangle \\ = \|x_i - u\|^2 + \|\theta_i(x_i - x_{i-1}) - \alpha_i(x_i - V(x_i))\|^2 + 2\theta_i\langle x_i - u, x_i - x_{i-1} \rangle - 2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle. $ (3.18)

    Replacing $ u $ by $ x_{i+1} $ in (3.18), we have

    $ \|w_i - x_{i+1}\|^2 = \|x_i - x_{i+1}\|^2 + \|\alpha_i(x_i - V(x_i)) - \theta_i(x_i - x_{i-1})\|^2 + 2\theta_i\langle x_i - x_{i+1}, x_i - x_{i-1} \rangle - 2\alpha_i\langle x_i - x_{i+1}, x_i - V(x_i) \rangle. $ (3.19)

    Substituting (3.18) and (3.19) into (3.17), we have

    $ \|x_{i+1} - u\|^2 \leqslant \|x_i - u\|^2 + \|\theta_i(x_i - x_{i-1}) - \alpha_i(x_i - V(x_i))\|^2 + 2\theta_i\langle x_i - u, x_i - x_{i-1} \rangle - 2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle - \|x_i - x_{i+1}\|^2 - 2\theta_i\langle x_i - x_{i+1}, x_i - x_{i-1} \rangle + 2\alpha_i\langle x_i - x_{i+1}, x_i - V(x_i) \rangle - \|\alpha_i(x_i - V(x_i)) - \theta_i(x_i - x_{i-1})\|^2 \\ = \|x_i - u\|^2 - \|x_i - x_{i+1}\|^2 + 2\theta_i\langle x_i - u, x_i - x_{i-1} \rangle - 2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle + 2\alpha_i\langle x_i - x_{i+1}, x_i - V(x_i) \rangle + \theta_i\|x_i - x_{i+1}\|^2 + \theta_i\|x_i - x_{i-1}\|^2 - \theta_i\|x_i - x_{i+1} + (x_i - x_{i-1})\|^2. $ (3.20)

    Therefore, we obtain

    $ \|x_{i+1} - u\|^2 - \|x_i - u\|^2 - \theta_i\|x_i - x_{i-1}\|^2 + \|x_i - x_{i+1}\|^2 - \theta_i\|x_i - x_{i+1}\|^2 \\ \leqslant 2\theta_i\langle x_i - u, x_i - x_{i-1} \rangle - 2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle + 2\alpha_i\langle x_i - x_{i+1}, x_i - V(x_i) \rangle \\ = -2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle - \theta_i\|x_{i-1} - u\|^2 + \theta_i\|x_i - u\|^2 + \theta_i\|x_i - x_{i-1}\|^2 - \alpha_i\|V(x_i) - x_{i+1}\|^2 + \alpha_i\|x_{i+1} - x_i\|^2 + \alpha_i\|x_i - V(x_i)\|^2. $ (3.21)

    It follows that

    $ -2\alpha_i \langle x_i-u, x_i-V(x_i) \rangle \geqslant \|x_{i+1}-u\|^2 - \|x_i-u\|^2 - \theta_i\|x_i-x_{i-1}\|^2 + \|x_i-x_{i+1}\|^2 \\ \quad -\theta_i\|x_i-x_{i+1}\|^2 + \theta_i\|x_{i-1}-u\|^2 - \theta_i\|x_i-u\|^2 - \theta_i\|x_i-x_{i-1}\|^2 \\ \quad +\alpha_i\|V(x_i)-x_{i+1}\|^2 - \alpha_i\|x_{i+1}-x_i\|^2 - \alpha_i\|x_i-V(x_i)\|^2 \\ \geqslant \|x_{i+1}-u\|^2 - \|x_i-u\|^2 + 2\theta_{i+1}\|x_{i+1}-x_i\|^2 - 2\theta_i\|x_i-x_{i-1}\|^2 \\ \quad +\theta_i(\|x_{i-1}-u\|^2 - \|x_i-u\|^2) + \alpha_i(\|V(x_i)-x_{i+1}\|^2 - \|x_i-V(x_i)\|^2) \\ \quad +(1-\theta_i-2\theta_{i+1}-\alpha_i)\|x_{i+1}-x_i\|^2. $ (3.22)

    Since $ {\theta_i} $ is non-decreasing and $ \alpha_i $ is non-increasing, we then obtain

    $ -2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle \geqslant \|x_{i+1} - u\|^2 - \|x_i - u\|^2 + 2\theta_{i+1}\|x_{i+1} - x_i\|^2 - 2\theta_i\|x_i - x_{i-1}\|^2 + \alpha_{i+1}\|V(x_i) - x_{i+1}\|^2 - \alpha_i\|x_i - V(x_i)\|^2 - \theta_i\|x_i - u\|^2 + \theta_{i-1}\|x_{i-1} - u\|^2 + (1 - 3\theta_{i+1} - \alpha_i)\|x_i - x_{i+1}\|^2. $

    Lemma 3.5. Assume that Conditions 2.6 and 3.2 hold. Then $ \{x_i\} $ generated by Algorithm 3.1 is bounded.

    Proof. From (3.15) and Condition 3.2 (ii), there exists $ K > 0 $ such that

    $ \|x_{i+1} - u\| \leqslant \|w_i - u\| = \|\alpha_i V(x_i) + (1-\alpha_i)x_i + \theta_i(x_i - x_{i-1}) - u\| \\ \leqslant \alpha_i\|V(x_i) - u\| + (1-\alpha_i)\|x_i - u\| + \theta_i\|x_i - x_{i-1}\| = \alpha_i\|V(x_i) - u\| + (1-\alpha_i)\|x_i - u\| + \alpha_i\frac{\theta_i}{\alpha_i}\|x_i - x_{i-1}\| $ (3.23)
    $ \leqslant \alpha_i\|V(x_i) - u\| + (1-\alpha_i)\|x_i - u\| + \alpha_i K \leqslant \alpha_i\big(\|V(x_i) - V(u)\| + \|V(u) - u\|\big) + (1-\alpha_i)\|x_i - u\| + \alpha_i K $ (3.24)
    $ \leqslant \big(1 - \alpha_i(1-\alpha)\big)\|x_i - u\| + \alpha_i(1-\alpha)\left(\frac{\|V(u) - u\| + K}{1-\alpha}\right) $ (3.25)
    $ \leqslant \max\left\{\|x_i - u\|, \frac{\|V(u) - u\| + K}{1-\alpha}\right\}. $ (3.26)

    This implies that $ \|x_{i+1}-u\| \leqslant max\{\|x_1-u\|, \frac{\|V(u)-u\| + K}{1-\alpha} \}. $ This shows that $ \{x_i\} $ is bounded.

    Lemma 3.6. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. For each $ i \geqslant 1 $, define

    $ u_i = \|x_i - u\|^2 - \theta_{i-1}\|x_{i-1} - u\|^2 + 2\theta_i\|x_i - x_{i-1}\|^2 + \alpha_i\|x_i - V(x_i)\|^2. $

    Then $ u_i \geqslant 0 $.

    Proof. Since $ \{\theta_i\} $ is non-decreasing with $ 0 \leqslant \theta_i < \frac{1}{3} $, and $ 2\langle x, y \rangle = \|x\|^2 + \|y\|^2 -\|x-y\|^2 $ for all $ x, y \in H $, we have

    $ u_i = \|x_i - u\|^2 - \theta_{i-1}\big[\|x_{i-1} - x_i\|^2 + \|x_i - u\|^2 + 2\langle x_{i-1} - x_i, x_i - u \rangle\big] + 2\theta_i\|x_i - x_{i-1}\|^2 + \alpha_i\|x_i - V(x_i)\|^2 \\ = \|x_i - u\|^2 - \theta_{i-1}\big[2\|x_{i-1} - x_i\|^2 + 2\|x_i - u\|^2 - \|x_{i-1} - 2x_i + u\|^2\big] + 2\theta_i\|x_i - x_{i-1}\|^2 + \alpha_i\|x_i - V(x_i)\|^2 \\ \geqslant \|x_i - u\|^2 - 2\theta_i\|x_{i-1} - x_i\|^2 - \frac{2}{3}\|x_i - u\|^2 + \theta_{i-1}\|x_{i-1} - 2x_i + u\|^2 + 2\theta_i\|x_i - x_{i-1}\|^2 + \alpha_i\|x_i - V(x_i)\|^2 \\ \geqslant \frac{1}{3}\|x_i - u\|^2 + \alpha_i\|x_i - V(x_i)\|^2 \geqslant 0. $

    This completes the proof.

    Lemma 3.7. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. Suppose

    $ \lim\limits_{i \to \infty}\|x_{i+1} - x_i\| = 0, $

    and

    $ \lim\limits_{i \to \infty}\big(\|x_{i+1} - u\|^2 - \theta_i\|x_i - u\|^2\big) = 0. $

    Then $ \{x_i\} $ converges strongly to $ u \in EP(f, \mathcal{K}) $.

    Proof. By our assumptions, we have

    $ 0 = \lim\limits_{i \to \infty}\big(\|x_{i+1} - u\|^2 - \theta_i\|x_i - u\|^2\big) = \lim\limits_{i \to \infty}\big[\big(\|x_{i+1} - u\| + \theta_i\|x_i - u\|\big)\big(\|x_{i+1} - u\| - \theta_i\|x_i - u\|\big)\big]. $ (3.27)

    In the case

    $ \lim\limits_{i \to \infty}\big(\|x_{i+1} - u\| + \theta_i\|x_i - u\|\big) = 0, $

    this implies that $ \{x_i\} $ converges strongly to $ u $ immediately. Assume this limit does not hold. Then there is a subset $ N^* \subseteq \mathbb{N} $ and a constant $ \rho > 0 $ such that

    $ \|x_{i+1} - u\| + \theta_i\|x_i - u\| \geqslant \rho, \ \forall i \in N^*. $ (3.28)

    Using (3.27) and $ \theta_i \leqslant \theta < 1 $, for $ i \in N^* $ it then follows that

    $ 0 = \lim\limits_{i \to \infty}\big(\|x_{i+1} - u\| - \theta_i\|x_i - u\|\big) \geqslant \limsup\limits_{i \to \infty}\big(\|x_i - u\| - \|x_{i+1} - x_i\| - \theta_i\|x_i - u\|\big) \\ \geqslant \limsup\limits_{i \to \infty}\big((1-\theta)\|x_i - u\| - \|x_{i+1} - x_i\|\big) = (1-\theta)\limsup\limits_{i \to \infty}\|x_i - u\| - \lim\limits_{i \to \infty}\|x_{i+1} - x_i\| = (1-\theta)\limsup\limits_{i \to \infty}\|x_i - u\|. $

    Consequently, we have $ \limsup \limits_{i \rightarrow \infty} \|x_i-u\| \leqslant 0 $. Since $ \liminf_{i \rightarrow \infty} \|x_i-u\| \geqslant 0 $ obviously holds, it follows that $ \lim\limits_{i \rightarrow \infty} \|x_i-u\| = 0 $. This implies (by (3.28))

    $ \|x_{i+1} - x_i\| \geqslant \|x_{i+1} - u\| - \|x_i - u\| = \|x_{i+1} - u\| + \theta_i\|x_i - u\| - (1+\theta_i)\|x_i - u\| \geqslant \rho - (1+\theta_i)\|x_i - u\| \geqslant \frac{\rho}{2}, $

    for all $ i \in N^* $ sufficiently large, a contradiction to the assumption that $ \lim\limits_{i \to \infty} \|x_{i+1}-x_i\| = 0 $. This completes the proof.

    We now give the following strong convergence result of Algorithm 3.1.

    Theorem 3.8. Assume that Conditions 2.6 and 3.2 hold. Then $ \{x_i\} $ generated by Algorithm 3.1 strongly converges to the solution $ u = P_{EP(f, \mathcal{K})} V(u) $.

    Proof. From Lemma 3.6 and (3.16), we have

    $ u_{i+1} - u_i \leqslant \alpha_{i+1}\|x_{i+1} - V(x_{i+1})\|^2 + \alpha_{i+1}\|V(x_i) - x_{i+1}\|^2 - (1 - 3\theta_{i+1} - \alpha_i)\|x_i - x_{i+1}\|^2 - 2\alpha_i\langle x_i - u, x_i - V(x_i) \rangle. $ (3.29)

    Since $ P_{EP(f, \mathcal{K})} V $ is a contraction, by the Banach fixed point theorem there exists a unique $ u = P_{EP(f, \mathcal{K})} V(u) $. It follows from Lemma 3.3 that

    $ \|x_{i+1}-u\|^2 \leqslant \|w_i-u\|^2 \\ = \|\alpha_i(V(x_i)-u)+(1-\alpha_i)(x_i-u)+\theta_i(x_i-x_{i-1})\|^2 \\ \leqslant \|(1-\alpha_i)(x_i-u)+\theta_i(x_i-x_{i-1})\|^2 + 2\langle \alpha_i(V(x_i)-u), w_i-u \rangle \\ = \|(1-\alpha_i)(x_i-u)+\theta_i(x_i-x_{i-1})\|^2 + 2\alpha_i \langle V(x_i)-V(u), w_i-u \rangle + 2\alpha_i \langle V(u)-u, w_i-u \rangle \\ \leqslant \|(1-\alpha_i)(x_i-u)+\theta_i(x_i-x_{i-1})\|^2 + 2\alpha_i \langle V(u)-u, w_i-u \rangle + 2\alpha_i \alpha \| x_i-u\| \|w_i-u\| \\ \leqslant \|(1-\alpha_i)(x_i-u)+\theta_i(x_i-x_{i-1})\|^2 + 2\alpha_i \langle V(u)-u, w_i-u \rangle + \alpha_i \alpha ( \| x_i-u\|^2 + \|w_i-u\|^2 ) \\ \leqslant \frac{1}{1-\alpha_i \alpha} \bigg(\|(1-\alpha_i)(x_i-u) + \theta_i(x_i-x_{i-1})\|^2 + \alpha_i \alpha \|x_i-u\|^2 + 2\alpha_i\langle V(u)-u, w_i-u \rangle \bigg) \\ \leqslant \frac{1}{1-\alpha_i \alpha}\bigg( \|(1-\alpha_i)(x_i-u)\|^2 + 2\langle \theta_i(x_i-x_{i-1}), (1-\alpha_i)(x_i-u) +\theta_i(x_i-x_{i-1}) \rangle \\ \quad +\alpha_i \alpha \|x_i-u\|^2 + 2\alpha_i\langle V(u)-u, w_i-u \rangle \bigg) \\ = \frac{(1-\alpha_i)^2+\alpha_i \alpha}{1-\alpha_i \alpha}\|x_i-u\|^2 + \frac{1}{1-\alpha_i \alpha}\bigg(2\langle\theta_i(x_i-x_{i-1}), (1-\alpha_i)(x_i-u)+\theta_i(x_i-x_{i-1})\rangle \\ \quad + 2\alpha_i \langle V(u)-u, w_i-u\rangle \bigg) \\ = \big(1-(\frac{2\alpha_i(1-\alpha)}{1-\alpha_i \alpha} - \frac{(\alpha_i)^2}{1-\alpha_i \alpha} )\big)\|x_i-u\|^2+ \frac{1}{1-\alpha_i \alpha}\bigg( 2\langle \theta_i(x_i-x_{i-1}), (1-\alpha_i)(x_i-u) \\ \quad + \theta_i(x_i-x_{i-1}) \rangle + 2\alpha_i\langle V(u)-u, w_i-u \rangle \bigg) \\ \leqslant \big(1-\frac{2\alpha_i(1-\alpha)}{1-\alpha_i \alpha} \big)\|x_i-u\|^2 + \frac{2\alpha_i(1-\alpha)}{1-\alpha_i \alpha} \bigg( \frac{\alpha_i}{2(1-\alpha)}\|x_i-u\|^2 \\ \quad + \frac{1}{\alpha_i(1-\alpha)}\langle \theta_i(x_i-x_{i-1}), (1-\alpha_i)(x_i-u)+ \theta_i(x_i-x_{i-1}) \rangle \\ \quad + \frac{1}{1-\alpha}\langle V(u)-u, w_i-u \rangle \bigg). $ (3.30)

    We will consider two cases.

    Case 1. Suppose that $ u_{i+1} \leqslant u_i+t_i $ for all $ i \geqslant i_0 $ for some $ i_0 \in \mathbb{N} $, where $ t_i \geqslant 0 $ and $ \sum_{i = 1}^{\infty} t_i < +\infty $. Since $ u_i \geqslant 0 $ for all $ i \geqslant 1 $, by Lemma 2.5 the limit $ \lim\limits_{i \to \infty} u_i = \lim\limits_{i \to \infty} u_{i+1} $ exists. Since $ \{x_i\} $ is bounded by Lemma 3.5, there exist $ M_1 > 0 $ such that $ 2|\langle x_i-u, x_i-V(x_i)\rangle| \leqslant M_1 $ and $ M_2 > 0 $ such that $ \|x_{i+1}-V(x_{i+1})\|^2 + \|V(x_i)-x_{i+1}\|^2 \leqslant M_2 $. Since $ 0 \leqslant \theta_i \leqslant \theta_{i+1} \leqslant \theta < \frac{1}{3} $ and $ \lim\limits_{i \to \infty} \alpha_i = 0 $, there exist $ N \in \mathbb{N} $ and $ \gamma_1 > 0 $ such that $ 1-3\theta_{i+1}-\alpha_i \geqslant \gamma_1 $ for all $ i \geqslant N $. Therefore, for $ i \geqslant N $, we obtain from (3.29) that

    $ \gamma_1\|x_{i+1} - x_i\|^2 \leqslant \alpha_i M_1 + \alpha_{i+1} M_2 + u_i - u_{i+1} \rightarrow 0, $ (3.31)

    as $ \ i \rightarrow \infty $. Hence $ \lim\limits_{i \to \infty} \|x_{i+1}-x_i\| = 0 $. For $ u \in EP(f, \mathcal{K}) $, we have

    $ \|w_i - u\|^2 = \|\alpha_i V(x_i) + (1-\alpha_i)x_i + \theta_i(x_i - x_{i-1}) - u\|^2 \\ \leqslant \|\alpha_i V(x_i) + (1-\alpha_i)x_i - u\|^2 + 2\langle \theta_i(x_i - x_{i-1}), w_i - u \rangle \\ \leqslant \alpha_i\|V(x_i) - u\|^2 + (1-\alpha_i)\|x_i - u\|^2 + 2\theta_i\|x_i - x_{i-1}\|\|w_i - u\| \\ \leqslant \alpha_i\|V(x_i) - u\|^2 + (1-\alpha_i)\|x_i - u\|^2 + 2\frac{\theta_i}{\alpha_i}\|x_i - x_{i-1}\|\|w_i - u\| \\ \leqslant \alpha_i\|V(x_i) - u\|^2 + \|x_i - u\|^2 + 2\frac{\theta_i}{\alpha_i}\|x_i - x_{i-1}\|\|w_i - u\|, $ (3.32)

    and from (3.14), we have

    $ \|x_{i+1}-u\|^2 = \|w_i-u\|^2 - \tau(1-2\lambda_i c_1)\|w_i-y_i\|^2 - \tau(1-2\lambda_i c_2)\|z_i-y_i\|^2 \\ \quad -(1-\tau)\tau\frac{1}{\tau^2}\|x_{i+1}-w_i\|^2 \\ \leqslant \alpha_i\|V(x_i)-u\|^2 + \|x_i-u\|^2 + 2\frac{\theta_i}{\alpha_i}\|x_i-x_{i-1}\|\|w_i-u\| - \tau(1-2\lambda_i c_1)\|w_i-y_i\|^2 \\ \quad - \tau(1-2\lambda_i c_2)\|z_i-y_i\|^2 - \frac{1-\tau}{\tau}\|x_{i+1}-w_i\|^2. $ (3.33)

    This implies that

    $ \tau(1 - 2\lambda_i c_1)\|w_i - y_i\|^2 + \tau(1 - 2\lambda_i c_2)\|z_i - y_i\|^2 + \frac{1-\tau}{\tau}\|x_{i+1} - w_i\|^2 \\ \leqslant \alpha_i\|V(x_i) - u\|^2 + \|x_i - u\|^2 + 2\frac{\theta_i}{\alpha_i}\|x_i - x_{i-1}\|\|w_i - u\| - \|x_{i+1} - u\|^2. $ (3.34)

    By our condition and (3.31), we obtain

    $ \lim\limits_{i \to \infty}\|w_i - y_i\| = \lim\limits_{i \to \infty}\|z_i - y_i\| = \lim\limits_{i \to \infty}\|x_{i+1} - w_i\| = 0. $ (3.35)

    Since $ \{x_i\} $ is bounded, there exists a subsequence $ \{x_{i_k}\} $ of $ \{x_i\} $ such that $ x_{i_k} \rightharpoonup x^* $ for some $ x^* \in \mathcal{H} $. From (3.31) and (3.35), we get $ w_{i_k} \rightharpoonup x^* $ and $ y_{i_k} \rightharpoonup x^* $ as $ k \rightarrow \infty $.

    By the definition of $ z_i $ and (3.6), we have

    $ \lambda_{i_k} f(y_{i_k}, y) \geqslant \lambda_{i_k} f(y_{i_k}, z_{i_k}) + \langle w_{i_k} - z_{i_k}, y - z_{i_k} \rangle \\ \geqslant \lambda_{i_k} f(w_{i_k}, z_{i_k}) - \lambda_{i_k} f(w_{i_k}, y_{i_k}) - c_1\|w_{i_k} - y_{i_k}\|^2 - c_2\|z_{i_k} - y_{i_k}\|^2 + \langle w_{i_k} - z_{i_k}, y - z_{i_k} \rangle \\ \geqslant \langle y_{i_k} - w_{i_k}, y_{i_k} - z_{i_k} \rangle + \langle w_{i_k} - z_{i_k}, y - z_{i_k} \rangle - c_1\|w_{i_k} - y_{i_k}\|^2 - c_2\|z_{i_k} - y_{i_k}\|^2. $

    Since $ \{z_{i_k}\} $ is bounded, $ 0 < \lambda_{i_k} \leqslant \lambda < \frac{1}{2\max\{c_1, c_2\}} $, and Condition 2.6 (A3) holds, it follows that $ 0 \leqslant \limsup \limits_{k \to \infty} f(y_{i_k}, y) \leqslant f(x^*, y) $ for all $ y \in \mathcal{H} $. This implies that $ f(x^*, y) \geqslant 0 $ for all $ y \in \mathcal{K} $, which shows that $ x^* \in EP(f, \mathcal{K}) $. Then, we have

    $ \limsup\limits_{i \to \infty}\langle V(u) - u, w_i - u \rangle = \lim\limits_{k \to \infty}\langle V(u) - u, w_{i_k} - u \rangle = \langle V(u) - u, x^* - u \rangle \leqslant 0, $ (3.36)

    by $ u = P_{EP(f, \mathcal{K})} V(u) $. Applying (3.36) to the inequality (3.30) with Lemma 2.4, we can conclude that $ x_i \rightarrow u = P_{EP(f, \mathcal{K})} V(u) $ as $ i \rightarrow \infty $.

    Case 2. In another case of $ \{u_i\} $, we let $ \phi : \mathbb{N} \rightarrow \mathbb{N} $ be the map defined for all $ i \geqslant i_0 $ (for some $ i_0 \in \mathbb{N} $ large enough) by

    $ \phi(i) = \max\{k \in \mathbb{N} : k \leqslant i, \ u_k + t_k \leqslant u_{k+1}\}. $ (3.37)

    Clearly, $ \phi(i) $ is a non-decreasing sequence such that $ \phi(i) \rightarrow \infty $ for $ i \rightarrow \infty $ and $ u_{\phi(i)}+t_{\phi(i)} \leqslant u_{\phi(i)+1} $ for all $ i \geqslant i_0 $. Hence, similar to the proof of Case 1, we therefore obtain from (3.31) that

    $ \gamma_1\|x_{\phi(i)+1} - x_{\phi(i)}\|^2 \leqslant \alpha_{\phi(i)} M_1 + \alpha_{\phi(i)+1} M_2 + u_{\phi(i)} - u_{\phi(i)+1} \rightarrow 0 $ (3.38)

    for some constant $ M_1, M_2 > 0 $. Thus

    $ \lim\limits_{i \to \infty}\|x_{\phi(i)+1} - x_{\phi(i)}\| = 0. $ (3.39)

    By the same argument as in Case 1, one also derives

    $ \lim\limits_{i \to \infty}\|x_{\phi(i)+1} - w_{\phi(i)}\| = \lim\limits_{i \to \infty}\|w_{\phi(i)} - x_{\phi(i)}\| = \lim\limits_{i \to \infty}\|x_{\phi(i)} - z_{\phi(i)}\| = 0. $ (3.40)

    Again observe that, for $ j \geqslant 0 $, by (3.29) we have $ u_{j+1} < u_j+t_j $ when $ x_j \notin \Omega = \{x \in H : \langle x-x_0, x-u \rangle \leqslant 0\} $. Hence $ x_{\phi (i)} \in \Omega $ for all $ i \geqslant i_0 $, since $ u_{\phi (i)}+t_{\phi (i)} \leqslant u_{\phi (i)+1} $. Since $ \{x_{\phi (i)}\} $ is bounded, there exists a subsequence of $ \{x_{\phi (i)}\} $, still denoted by $ \{x_{\phi(i)}\} $, which converges weakly to some $ x^* \in \mathcal{H} $. As $ \Omega $ is a closed and convex set, it is weakly closed and so $ x^* \in \Omega $. Using (3.40), one can see as in Case 1 that $ z_{\phi (i)} \rightharpoonup x^* $ and $ x^* \in EP(f, \mathcal{K}) $; moreover, by Lemma 2.3, $ \Omega \cap EP(f, \mathcal{K}) $ contains $ u $ as its only element. We therefore have $ x^* = u $. Furthermore,

    $ \|x_{\phi(i)} - u\|^2 = \langle x_{\phi(i)} - V(x_{\phi(i)}), x_{\phi(i)} - u \rangle - \langle u - V(x_{\phi(i)}), x_{\phi(i)} - u \rangle \leqslant -\langle u - V(x_{\phi(i)}), x_{\phi(i)} - u \rangle, $

    due to $ x_{\phi (i)} \in \Omega $. This gives

    $ \limsup\limits_{i \to \infty}\|x_{\phi(i)} - u\| \leqslant 0. $

    Hence

    $ \lim\limits_{i \to \infty}\|x_{\phi(i)} - u\| = 0. $ (3.41)

    By the definition of $ u_{\phi (i) +1} $, we have

    $ u_{\phi (i)+1} = \|x_{\phi (i)+1} - u\|^2 - \theta_{\phi (i)}\|x_{\phi (i)} - u\|^2 + 2\theta_{\phi (i) + 1} \|x_{\phi (i) + 1} - x_{\phi (i)}\|^2 + \alpha_{\phi (i) + 1} \|x_{\phi (i) + 1} - V(x_{\phi (i)+1})\|^2 \\ \leqslant (\|x_{\phi (i)+1}-x_{\phi (i)}\|+\|x_{\phi (i)}-u\|)^2 - \theta_{\phi (i)}\|x_{\phi (i)}-u\|^2 + 2\theta_{\phi (i) + 1} \|x_{\phi (i) + 1} - x_{\phi (i)}\|^2 + \alpha_{\phi (i) + 1} \|x_{\phi (i) + 1} - V(x_{\phi (i)+1})\|^2. $ (3.42)

    By our Condition 3.2 (i), (3.39) and (3.41), we obtain $ \lim\limits_{i \to \infty} u_{\phi(i)+1} = 0 $. We next show that we actually have $ \lim\limits_{i \to \infty} u_i = 0 $. To this end, first observe that, for $ i \geqslant i_0 $, one has $ u_i+t_i \leqslant u_{\phi (i) + 1} $ if $ i \neq \phi(i) $. It follows that for all $ i \geqslant i_0 $, we have $ u_i \leqslant \max\{u_{\phi(i)}, u_{\phi (i) + 1} \} = u_{\phi (i) + 1} \rightarrow 0 $, since $ \lim\limits_{i \to \infty} t_i = 0 $, hence $ \limsup \limits_{i \to \infty} u_i \leqslant 0 $. On the other hand, Lemma 3.6 implies that $ \liminf_{i \to \infty} u_i \geqslant 0 $. Hence, we obtain $ \lim\limits_{i \to \infty} u_i = 0 $. Consequently, the boundedness of $ \{x_i\} $, $ \lim\limits_{i \to \infty} \alpha_i = 0 $, and (3.29) show that $ \|x_i - x_{i+1}\| \rightarrow 0 $, as $ i \rightarrow \infty $. Hence the definition of $ u_i $ yields $ (\|x_{i+1}-u\|^2 - \theta_i \|x_i-u\|^2) \rightarrow 0 $, as $ i \rightarrow \infty $. By using Lemma 3.7, we obtain the desired conclusion immediately.

    Setting $ V(x) = x_0 $ for all $ x\in \mathcal{H} $, we obtain the following modified Halpern inertial extragradient algorithm for EPs:

    Algorithm 3.9. (Modified Halpern inertial extragradient algorithm for EPs)
    Initialization. Let $ x_0, x_1 \in H $, $ 0 < \lambda_i \leqslant \lambda < \frac{1}{2\max\{c_1, c_2\}} $, $ \tau \in (0, \frac{1}{2}] $.
    Step 1. Given the current iterates $ x_{i-1} $ and $ x_i (i \geqslant 1) $ and $ \alpha_i \in (0, 1) $, $ \theta_i \in [0, \frac{1}{3}) $, compute
    $ w_i = \alpha_i x_0 + (1-\alpha_i)x_i + \theta_i(x_i - x_{i-1}), \quad y_i = \arg\min\limits_{y \in \mathcal{K}}\{\lambda_i f(w_i, y) + \frac{1}{2}\|y - w_i\|^2\}. $
    If $ y_i = w_i $ then stop and $ y_i $ is a solution. Otherwise, go to Step 2.
    Step 2. Compute
    $ z_i = \arg\min\limits_{y \in \mathcal{K}}\{\lambda_i f(y_i, y) + \frac{1}{2}\|y - w_i\|^2\}. $
    Step 3. Compute
    $ x_{i+1} = (1-\tau)w_i + \tau z_i. $
    Set $ i = i+1 $ and return to Step 1.


    From Algorithm 3.1, the convergence depends on the parameter $ \{\lambda_i\} $ with the condition $ 0 < \lambda_{i}\leq\lambda < \frac{1}{2\max\{c_1, c_2\}} $. So, the step size $ \{\lambda_i\} $ can be chosen in many ways. Applying the step-size concept of Shehu et al. [22], we then obtain the following modified viscosity type inertial extragradient stepsize algorithm for EPs:

    Algorithm 3.10. (Modified viscosity type inertial extragradient stepsize algorithm for EPs)
    Initialization. Let $ x_0, x_1 \in H $, $ \lambda_1 \in (0, \frac{1}{2\max\{c_1, c_2\}}) $, $ \mu \in (0, 1) $, $ \tau \in (0, \frac{1}{2}] $.
    Step 1. Given the current iterates $ x_{i-1} $ and $ x_i (i \geqslant 1) $ and $ \alpha_i \in (0, 1) $, $ \theta_i \in [0, \frac{1}{3}) $, compute
    $ w_i = \alpha_i V(x_i) + (1-\alpha_i)x_i + \theta_i(x_i - x_{i-1}), \quad y_i = \arg\min\limits_{y \in \mathcal{K}}\{\lambda_i f(w_i, y) + \frac{1}{2}\|y - w_i\|^2\}. $
    If $ y_i = w_i $ then stop and $ y_i $ is a solution. Otherwise, go to Step 2.
    Step 2. Compute
    $ z_i = \arg\min\limits_{y \in \mathcal{K}}\{\lambda_i f(y_i, y) + \frac{1}{2}\|y - w_i\|^2\}. $
    Step 3. Compute
    $ x_{i+1} = (1-\tau)w_i + \tau z_i, $
    and
    $ \lambda_{i+1} = \begin{cases} \min\left\{\dfrac{\mu}{2}\dfrac{\|w_i - y_i\|^2 + \|z_i - y_i\|^2}{f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i)}, \lambda_i\right\}, & \text{if } f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) > 0, \\ \lambda_i, & \text{otherwise}. \end{cases} $
    Set $ i = i+1 $ and return to Step 1.
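
    The adaptive rule in Step 3 can be coded directly; the small Python helper below (a sketch, consistent with the update displayed above) performs one update of $ \lambda_{i+1} $ from $ w_i $, $ y_i $, $ z_i $ for a user-supplied bifunction $ f $.

```python
import numpy as np

def update_lambda(lam, mu, f, w, y, z):
    # lam_{i+1} = min{ (mu/2)*(||w-y||^2 + ||z-y||^2) / (f(w,z) - f(w,y) - f(y,z)), lam }
    # whenever the denominator is positive; otherwise lam is kept unchanged.
    denom = f(w, z) - f(w, y) - f(y, z)
    if denom > 0:
        return min(mu * (np.linalg.norm(w - y) ** 2 + np.linalg.norm(z - y) ** 2) / (2.0 * denom), lam)
    return lam
```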

    Remark 3.11. (i) Since $ V(x) = x_0, \ \forall x\in \mathcal{H} $, is a contraction, the modified Halpern inertial extragradient Algorithm 3.9 converges strongly to $ x^* = P_{EP(f, \mathcal{K})}x_0 $ under Conditions 2.6 and 3.2;

    (ii) Since the step size $ \{\lambda_i\} $ in Algorithm 3.10 is a monotonically decreasing sequence bounded below by $ \min\{\lambda_1, \frac{\mu}{2\max\{c_1, c_2\}}\} $ [22], Algorithm 3.10 converges strongly to the solution $ u = P_{EP(f, \mathcal{K})} V(u) $ by Theorem 3.8.

    We now give an example in the infinite-dimensional space $ L_2[0, 1] $ to support the main theorem.

    Example 3.12. Let $ V : L_2[0, 1] \rightarrow L_2[0, 1] $ be defined by $ V(x(t)) = \frac{x(t)}{2} $ where $ x(t) \in L_2[0, 1] $. We can choose $ x_0(t) = \frac{sin(t)}{2} $ and $ x_1(t) = sin(t) $. The stopping criterion is defined by $ \|x_i-x_{i-1}\| < 10^{-2} $.

    We set the following parameters for each algorithm, as seen in Table 1.

    Table 1.  Chosen parameters of each algorithm.
    Algorithm 3.1 Algorithm 3.9 Algorithm 3.10
    $ \lambda_i $ 0.1 0.1 -
    $ \lambda_1 $ - - 0.12
    $ \theta_i $ 0.29 0.29 0.29
    $ \alpha_i $ $ \frac{1}{100i+1} $ $ \frac{1}{100i+1} $ $ \frac{1}{100i+1} $
    $ \tau_i $ 0.15 0.1 0.15
    $ \mu $ - - 0.2


    Next, we compare the performance of Algorithms 3.1, 3.9 and 3.10. We obtain the results as seen in Table 2.

    Table 2.  The performance of each algorithm.
    Algorithm 3.1 Algorithm 3.9 Algorithm 3.10
    CPU Time 1.2626 1.2010 177.9459
    Iter. No. 2 2 2


    From Figure 1, we see that the performance of Algorithm 3.10 is better than Algorithms 3.1 and 3.9.

    Figure 1.  The Cauchy error versus the number of iterations.

    According to the International Diabetes Federation (IDF), there are approximately 463 million people with diabetes worldwide, and it is estimated that by 2045 there will be 629 million people with diabetes. In Thailand, the incidence of diabetes is continuously increasing. There are about 300,000 new cases per year, and 3.2 million people with diabetes are registered in the Ministry of Public Health's registration system. This causes huge losses in health care costs; diabetes alone accounts for average treatment costs of up to 47,596 million baht per year. This has led to an ongoing campaign about the dangers of the disease. Furthermore, diabetes mellitus brings additional noncommunicable diseases that present a high risk for the patient, who is also more susceptible to infectious diseases such as COVID-19 [23]. Because it is a chronic disease that cannot be cured, there is a risk of complications spreading to the extent of losing vital organs of the body. The International Diabetes Federation and the World Health Organization (WHO) have designated November 14 of each year as World Diabetes Day to recognize the importance of this disease.

    In this research, we used the PIMA Indians diabetes dataset, which was downloaded from Kaggle (https://www.kaggle.com/uciml/pima-indians-diabetesdatabase) and is available publicly on the UCI repository, for training with our proposed algorithm. The dataset contains 768 pregnant female patients, of which 500 were non-diabetic and 268 were diabetic. There are 9 variables in the dataset; eight variables contain information about the patients, and the 9th variable is the class predicting the patients as diabetic or non-diabetic. The attributes are: number of times pregnant; plasma glucose concentration at 2 hours in an oral glucose tolerance test (GTIT); diastolic blood pressure (mm Hg); triceps skin fold thickness (mm); 2-hour serum insulin (μU/ml); body mass index (weight in kg/(height in m)²); diabetes pedigree function; age (years); and a binary value indicating non-diabetic/diabetic. For the implementation of the machine learning algorithms, 614 samples were used as a training dataset and 154 as a testing dataset, using 5-fold cross-validation [12]. For benchmarking the classifier, we consider the following methods which have been proposed to classify diabetes (see Table 3):

    Table 3.  Classification accuracy of different methods with literature.
    Authors Methods Accuracy (%)
    Li [13] Ensemble of SVM, ANN, and NB 58.3
    Brahim-Belhouari and Bermak [4] NB, SVM, DT 76.30
    Smith et al.[24] Neural ADAP algorithm 76
    Quinlan [17] C4.5 Decision trees 71.10
    Bozkurt et al.[3] Artificial neural network 76.0
    Sahan et al.[20] Artificial immune System 75.87
    Smith et al.[24] Ensemble of MLP and NB 64.1
    Chatrati et al.[5] Linear discriminant analysis 72
    Deng and Kasabov [7] Self-organizing maps 78.40
    Choubey et al. [6] Ensemble of RF and XB 78.9
    Saxena et al. [21] Feature selection of KNN, RF, DT, MLP 79.8
    Our Algorithm 3.1 Extreme learning machine 80.03


    We focus on extreme learning machine (ELM) proposed by Huang et al. [9] for applying our algorithms to solve data classification problems. It is defined as follows:

    Let $ E : = \{(\mathbf{x}_n, \mathbf{t}_n) : \mathbf{x}_n\in\mathbb{R}^n, \mathbf{t}_n\in\mathbb{R}^m, n = 1, 2, ..., P\} $ be a training set of $ P $ distinct samples where $ \mathbf{x}_n $ is an input training data and $ \mathbf{t}_n $ is a training target. The output function of ELM for single-hidden layer feed forward neural networks (SLFNs) with $ M $ hidden nodes and activation function $ U $ is

    $ O_n = \sum\limits_{j=1}^{M}\Theta_j U(w_j, b_j, \mathbf{x}_n), $

    where $ w_j $ and $ b_j $ are the weight and bias parameters, respectively, and $ \Theta_j $ is the optimal output weight at the $ j $-th hidden node. The hidden layer output matrix $ H $ is defined as follows:

    $ H = \left[\begin{matrix} U(w_1, b_1, \mathbf{x}_1) & \cdots & U(w_M, b_M, \mathbf{x}_1) \\ \vdots & \ddots & \vdots \\ U(w_1, b_1, \mathbf{x}_P) & \cdots & U(w_M, b_M, \mathbf{x}_P) \end{matrix}\right]. $

    To solve the ELM is to find an optimal output weight $ \Theta = [\Theta^T_1, ..., \Theta^T_M]^T $ such that $ H \Theta = T $, where $ T = [\mathbf{t}^T_1, ..., \mathbf{t}^T_P]^T $ is the training target. In some cases, one can take $ \Theta = H^{\dagger} T $, where $ H^{\dagger} $ is the Moore-Penrose generalized inverse of $ H $. However, if $ H^{\dagger} $ is difficult to compute, finding such a solution $ \Theta $ through convex minimization can overcome this difficulty.
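
    For illustration, the following minimal Python sketch (not from the paper) builds the hidden-layer output matrix $ H $ with a sigmoid activation and random hidden weights and biases, and computes $ \Theta = H^{\dagger}T $ via the Moore-Penrose pseudoinverse; the dimensions and the random initialization are assumptions.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def elm_hidden_matrix(X, W, b):
    # H[n, j] = U(w_j, b_j, x_n) with U(w, b, x) = sigmoid(<w, x> + b)
    # X: (P, d) inputs, W: (M, d) hidden weights, b: (M,) biases
    return sigmoid(X @ W.T + b)

rng = np.random.default_rng(0)
P, d, M = 20, 8, 160                                # samples, features, hidden nodes (illustrative)
X = rng.standard_normal((P, d))
T = rng.integers(0, 2, size=(P, 1)).astype(float)   # binary training targets
W = rng.standard_normal((M, d))
b = rng.standard_normal(M)

H = elm_hidden_matrix(X, W, b)
Theta = np.linalg.pinv(H) @ T                       # Theta = H^dagger T
print(Theta.shape)                                  # (M, 1)
```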

    In this section, we perform some experiments on the classification problem. This problem can be seen as the following convex minimization problem:

    $ \min\limits_{\Theta \in \mathbb{R}^M}\left\{\|H\Theta - T\|_2^2 + \lambda\|\Theta\|_1\right\}, $ (4.1)

    where $ \lambda $ is a regularization parameter. This problem is called the least absolute shrinkage and selection operator (LASSO) [26]. We set $ f(\Theta, \zeta) = \langle H^T(H\Theta-T), \zeta-\Theta\rangle $ and $ V(x) = Cx $, where $ C $ is a constant in $ (0, 1) $.
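
    With this choice, $ f(\Theta, \zeta) = \langle F(\Theta), \zeta-\Theta \rangle $ where $ F(\Theta) = H^{T}(H\Theta-T) $, so when the feasible set is all of $ \mathbb{R}^{M} $ the two argmin subproblems of Algorithm 3.1 become explicit gradient-type steps. The Python sketch below shows one iteration under this assumption; it is for illustration only, since the text does not specify how the $ \ell_1 $ term of (4.1) or a constraint set enters the subproblems.

```python
import numpy as np

def one_iteration(H, T, x_prev, x, lam, tau=0.5, theta=0.3, alpha_i=0.1, C=0.9999):
    # One pass of Algorithm 3.1 for f(Theta, zeta) = <H^T(H Theta - T), zeta - Theta>
    # with K = R^M (plain gradient steps) and viscosity mapping V(x) = C*x.
    F = lambda v: H.T @ (H @ v - T)
    w = alpha_i * (C * x) + (1.0 - alpha_i) * x + theta * (x - x_prev)   # inertial viscosity step
    y = w - lam * F(w)                                                   # y_i
    z = w - lam * F(y)                                                   # z_i
    return x, (1.0 - tau) * w + tau * z                                  # (new x_{i-1}, x_{i+1})
```

    In a full run, $ \alpha_i $ would decrease to zero as required by Condition 3.2 (i), for example $ \alpha_i = \frac{1}{10i+1} $ as in Table 9.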

    The binary cross-entropy loss function along with sigmoid activation function for binary classification calculates the loss of an example by computing the following average:

    $ \mathrm{Loss} = -\frac{1}{K}\sum\limits_{j=1}^{K}\big[y_j\log\hat{y}_j + (1-y_j)\log(1-\hat{y}_j)\big], $

    where $ \hat{y}_j $ is the $ j $-th scalar value in the model output, $ y_j $ is the corresponding target value, and $ K $ is the number of scalar values in the model output.
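
    As a small, self-contained illustration (not from the paper), the Python function below evaluates this average loss; the clipping constant is an assumption used only to avoid $ \log(0) $.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # average binary cross-entropy; eps-clipping avoids log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
```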

    In this work, the performance of the machine learning techniques is measured for all classes. The accuracy is calculated by dividing the total number of correct predictions by the total number of predictions. The precision and recall performance parameters are also measured. The three measures [27] are defined as follows:

    $ \mathsf{Precision\ (Pre)} = \frac{\mathsf{TP}}{\mathsf{TP}+\mathsf{FP}}. $ (4.2)
    $ \mathsf{Recall\ (Rec)} = \frac{\mathsf{TP}}{\mathsf{TP}+\mathsf{FN}}. $ (4.3)
    $ \mathsf{Accuracy\ (Acc)} = \frac{\mathsf{TP}+\mathsf{TN}}{\mathsf{TP}+\mathsf{FP}+\mathsf{TN}+\mathsf{FN}} \times 100\%, $ (4.4)

    where $ \mathsf{TP} $, $ \mathsf{TN} $, $ \mathsf{FP} $, and $ \mathsf{FN} $ denote the true positives, true negatives, false positives, and false negatives, respectively, of the confusion matrix for the original and predicted classes. Similarly, $ \mathsf{P} $ and $ \mathsf{N} $ are the positive and negative populations of malignant and benign cases, respectively.
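
    The three measures can be computed directly from the confusion-matrix counts, as in the following small Python helper (an illustrative sketch).

```python
def classification_metrics(tp, tn, fp, fn):
    # precision, recall and accuracy (in %) from confusion-matrix counts
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, accuracy
```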

    To start our computation, we set the activation function as sigmoid, hidden nodes $ M = 160 $, regularization parameter $ \lambda = 1\times10^{-5} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $, $ \mu = 0.2 $ for Algorithms 3.1, 3.9 and 3.10, and $ C = 0.9999 $ for Algorithms 3.1 and 3.10. The stopping criterion is 250 iterations. We obtain the results for the different values of the parameter $ S $, where $ \lambda_i = \frac{S}{max(eigenvalue(A^{T}A))} $, for Algorithms 3.1 and 3.9, and for the different values of the parameter $ \lambda_1 $ for Algorithm 3.10, as seen in Table 4.

    Table 4.  Training and validation loss and training time of the different parameter $ \lambda_i $ and $ \lambda_1 $.
    Loss
    $ S $, $ \lambda_1 $ Training Time Training Validation
    0.2 0.4164 0.286963 0.275532
    0.4 0.4337 0.283279 0.273650
    Algorithm 3.1 0.6 0.4164 0.286963 0.275532
    0.9 0.4459 0.278714 0.272924
    0.99 0.4642 0.278144 0.272921
    0.2 0.4283 0.291883 0.279878
    0.4 0.5293 0.288831 0.277365
    Algorithm 3.9 0.6 0.4246 0.286890 0.276099
    0.9 0.4247 0.284851 0.275079
    0.99 0.5096 0.284356 0.274879
    0.2 1.3823 0.286963 0.275532
    0.4 1.5652 0.283279 0.273650
    Algorithm 3.10 0.6 1.4022 0.281060 0.273120
    0.9 1.9170 0.278714 0.272924
    0.99 1.3627 0.278144 0.272921


    We can see that $ S = \lambda_1 = 0.99 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose this value as the default step-size parameter for the next computations.

    We next choose $ \lambda_i = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, and $ C = 0.9999 $ for Algorithm 3.1, together with $ \lambda_1 = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $, $ C = 0.9999 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider the different values of the initialization parameter $ \theta $, where

    $ \theta_i = \begin{cases} \theta, & \text{if } x_i = x_{i-1} \text{ and } i \leqslant N, \\ \dfrac{\theta}{i^2\|x_i - x_{i-1}\|}, & \text{otherwise}, \end{cases} $

    where $ N $ is the number of iterations at which we want to stop. We obtain the numerical results as seen in Table 5.

    Table 5.  Training and validation loss and training time of the different parameter $ \theta $.
    Loss
    $ \theta $ Training Time Training Validation
    0.1 0.4608 0.279629 0.272965
    0.2 0.4515 0.278938 0.272931
    Algorithm 3.1 0.3 0.4523 0.278144 0.272921
    $ \frac{1}{i} $ 0.4591 0.280107 0.273004
    $ \frac{1}{\|x_i-x_{i-1}\|+i^2} $ 0.5003 0.280221 0.273015
    0.1 0.4723 0.284808 0.274993
    0.2 0.4641 0.284587 0.274935
    Algorithm 3.9 0.3 0.4634 0.284356 0.274879
    $ \frac{1}{i} $ 0.5297 0.285004 0.275049
    $ \frac{1}{\|x_i-x_{i-1}\|+i^2} $ 0.4825 0.285019 0.275053
    0.1 1.4071 0.279629 0.272965
    0.2 1.3505 0.278938 0.272931
    Algorithm 3.10 0.3 1.4819 0.278144 0.272921
    $ \frac{1}{i} $ 1.3276 0.280107 0.273004
    $ \frac{1}{\|x_i-x_{i-1}\|+i^2} $ 1.4228 0.280221 0.273015


    We can see that $ \theta = 0.3 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default inertial parameter for next computation.

    We next set $ \lambda_i = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, and $ C = 0.9999 $ for Algorithm 3.1, together with $ \lambda_1 = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $, $ C = 0.9999 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider the different values of the initialization parameter $ \alpha_i $. The numerical results are shown in Table 6.

    Table 6.  Training and validation loss and training time of the different parameter $ \alpha_i $.
    Loss
    $ \alpha_i $ Training Time Training Validation
    $ \frac{1}{i+1} $ 0.4407 0.278144 0.272921
    Algorithm 3.1 $ \frac{1}{10i+1} $ 0.4054 0.278143 0.272921
    $ \frac{1}{i^2+1} $ 0.4938 0.278143 0.272921
    $ \frac{1}{10i^2+1} $ 0.4876 0.278143 0.272921
    $ \frac{1}{i+1} $ 0.4163 0.284356 0.274879
    Algorithm 3.9 $ \frac{1}{10i+1} $ 0.4274 0.279201 0.273129
    $ \frac{1}{i^2+1} $ 0.5150 0.278294 0.272931
    $ \frac{1}{10i^2+1} $ 0.5960 0.278160 0.272922
    $ \frac{1}{i+1} $ 1.4292 0.278144 0.272921
    Algorithm 3.10 $ \frac{1}{10i+1} $ 1.3803 0.278143 0.272921
    $ \frac{1}{i^2+1} $ 1.2452 0.278143 0.272921
    $ \frac{1}{10i^2+1} $ 1.4100 0.278143 0.272921


    We can see that $ \alpha_i = \frac{1}{10i+1} $ greatly improves the performance of Algorithm 3.1, $ \alpha_i = \frac{1}{10i^2+1} $ greatly improves the performance of Algorithm 3.9, and $ \alpha_i = \frac{1}{i^2+1} $ greatly improves the performance of Algorithm 3.10. Therefore, we choose these as the default parameters for the next computation.

    We next calculate the numerical results by setting $ \lambda_i = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{10i+1} $ and $ C = 0.9999 $ for Algorithm 3.1, $ \lambda_i = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{10i^2+1} $ for Algorithm 3.9, and $ \lambda_1 = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ C = 0.9999 $, $ \alpha_i = \frac{1}{i^2+1} $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider the different values of the initialization parameter $ \tau $. The numerical results are shown in Table 7.

    Table 7.  Training and validation loss and training time of the different parameter $ \tau $.
    Loss
    $ \tau $ Training Time Training Validation
    0.1 0.4278 0.300531 0.299144
    Algorithm 3.1 0.3 0.4509 0.299074 0.293717
    0.5 0.5239 0.278143 0.272921
    $ \frac{i}{2i+1} $ 0.4708 0.282187 0.274017
    0.1 0.4592 0.300531 0.299144
    Algorithm 3.9 0.3 0.4900 0.299074 0.293717
    0.5 0.4261 0.278160 0.272922
    $ \frac{i}{2i+1} $ 0.5224 0.282191 0.274018
    0.1 1.3401 0.300531 0.299144
    Algorithm 3.10 0.3 1.3771 0.299074 0.293717
    0.5 1.8681 0.278143 0.272921
    $ \frac{i}{2i+1} $ 1.4671 0.282187 0.274017


    We can see that $ \tau = 0.5 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default value of $ \tau $ for the next computation.

    We next calculate the numerical results by setting $ \lambda_i = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, and $ \alpha_i = \frac{1}{10i+1} $ for Algorithm 3.1 with $ \alpha_i = \frac{1}{10i^2+1} $ for Algorithm 3.9, and $ \lambda_1 = \frac{0.99}{max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{i^2+1} $, $ \tau = 0.5 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We obtain the results for the different values of the parameter $ C $, where $ V(x) = Cx $, for Algorithms 3.1 and 3.10, as seen in Table 8.

    Table 8.  Training and validation loss and training time of the different parameter $ c $.
    Loss
    $ C $ Training Time Training Validation
    0.3 0.4796 0.278902 0.273066
    0.5 0.4270 0.278695 0.273024
    Algorithm 3.1 0.7 0.4190 0.278480 0.272982
    0.9 0.4209 0.278257 0.272941
    0.9999 0.4844 0.278143 0.272921
    0.3 1.5886 0.278251 0.272928
    0.5 1.6358 0.278222 0.272926
    Algorithm 3.10 0.7 1.3808 0.278191 0.272924
    0.9 1.5176 0.278159 0.272922
    0.9999 1.4598 0.278143 0.272921


    From Tables 3–8, we choose the parameters for Algorithm 3.1 to compare with the existing algorithms from the literature. Table 9 shows the chosen parameters of each algorithm.

    Table 9.  Chosen parameters of each algorithm.
    Algorithm in (1.2) Algorithm in (1.4) Algorithm in (1.6) Algorithm 3.1 Algorithm 3.9 Algorithm 3.10
    $ \mu $ - 0.3 0.3 - - 0.2
    $ \lambda_1 $ - $ \frac{0.5}{max(eig(A^{T}A))} $ $ \frac{0.9999}{max(eig(A^{T}A))} $ - - $ \frac{0.99}{max(eig(A^{T}A))} $
    $ \lambda_i $ $ \frac{0.5}{max(eig(A^{T}A))} $ - - $ \frac{0.99}{max(eig(A^{T}A))} $ $ \frac{0.99}{max(eig(A^{T}A))} $ -
    $ \theta_i $ - - 0.3 0.3 0.3 0.3
    $ \alpha_i $ $ \frac{1}{100i+1} $ $ \frac{1}{100i+1} $ $ \frac{1}{2i+1} $ $ \frac{1}{10i+1} $ $ \frac{1}{10i^2+1} $ $ \frac{1}{i^2+1} $
    $ \tau $ - - 0.5 0.5 0.5 0.5
    $ c $ - - - 0.9999 - 0.9999


    For comparison, we set sigmoid as the activation function, hidden nodes $ M = 160 $ and regularization parameter $ \lambda = 1 \times 10^{-5} $.

    Table 10 shows that Algorithm 3.1 has the highest efficiency in precision, recall, and accuracy. It also has the lowest number of iterations and the highest probability of correctly classifying the cases compared with the other algorithms examined. We present the training and validation loss together with the training accuracy to show that Algorithm 3.1 does not overfit the training dataset.

    Table 10.  The performance of each algorithm.
    Algorithm Iteration No. Training Time Pre Rec Acc $ (\%) $
    Algorithm in (1.2) 25 0.0537 80.97 97.50 80.03
    Algorithm in (1.4) 25 0.3132 80.97 97.50 80.03
    Algorithm in (1.6) 30 0.1182 80.97 97.50 80.03
    Algorithm 3.1 18 0.0375 80.97 97.50 80.03
    Algorithm 3.9 18 0.0401 80.97 97.50 80.03
    Algorithm 3.10 18 0.1045 80.97 97.50 80.03


    From Figures 2 and 3, we see that our Algorithm 3.1 yields a well-fitted model; this means that Algorithm 3.1 suitably learns the training dataset and generalizes well to classify the PIMA Indians diabetes dataset.

    Figure 2.  Training and validation loss plots of Algorithm 3.1.
    Figure 3.  Training and validation accuracy plots of Algorithm 3.1.

    In general, screening for diabetes in pregnancy follows the recommendations of The American College of Obstetricians and Gynecologists (ACOG). The accuracy of our method is $ 80.03\% $, and such high accuracy may be used to correctly predict diabetes in pregnancy in the future.

    In this paper, we introduce a modified extragradient method with an inertial extrapolation step and a viscosity-type method to solve equilibrium problems for pseudomonotone bifunctions in real Hilbert spaces. We then prove a strong convergence theorem for the proposed algorithm under the assumption that the bifunction satisfies a Lipschitz-type condition. Moreover, we show that the step-size parameter $ \{\lambda_i\} $ can be chosen in several ways, which demonstrates the flexibility of our algorithm; see Algorithms 3.1 and 3.10. Finally, we show that our algorithms perform better than existing algorithms in solving the diabetes mellitus classification problem in machine learning.

    This research was also supported by Fundamental Fund 2022, Chiang Mai University and the NSRF via the Program Management Unit for Human Resources and Institutional Development, Research and Innovation (grant number B05F640183). W. Cholamjiak would like to thank National Research Council of Thailand (N42A650334) and Thailand Science Research and Innovation, the University of Phayao (FF65-UOE).

    The authors declare no conflict of interest.

    [1] N. Ansini, Gradient flows with wiggly potential: a variational approach to the dynamics, in Mathematical Analysis of Continuum Mechanics and Industrial Applications Ⅱ, CoMFoS16 (Springer), 30 (2017), 139-151. doi: 10.1007/978-981-10-6283-4_12
    [2] N. Ansini, A. Braides and J. Zimmer, Minimising movements for oscillating energies: The critical regime, Proceedings of the Royal Society of Edinburgh Section A (in press), 2016, arXiv: 1605.01885.
    [3] Curvature-driven flows: a variational approach. SIAM Journal of Control and Optimization (1993) 31: 387-438.
    [4] L. Ambrosio, N. Gigli and G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures, Birkhauser Verlag, second edition, 2008.
    [5] A. Braides, Γ-convergence for Beginners, Oxford University Press, 2002. doi: 10.1093/acprof:oso/9780198507840.001.0001
    [6] A. Braides, Local Minimization, Variational Evolution and Γ-convergence, Springer Cham, 2014. doi: 10.1007/978-3-319-01982-6
    [7] Minimizing movements along a sequence of functionals and curves of maximal slope. Comptes Rendus Matematique (2005) 354: 685-689.
    [8] M. Colombo and M. Gobbino, Passing to the limit in maximal slope curves: from a regularized Perona-Malik to the total variation flow, Mathematical Models and Methods in Applied Sciences, 22 (2012), 1250017, 19pp. doi: 10.1142/S0218202512500170
    [9] L. C. Evans, Partial Differential Equations, American Mathematical Society, second edition, 2010. doi: 10.1090/gsm/019
    [10] F. Fleissner, Γ-convergence and relaxations for gradient flows in metric spaces: A minimizing movement approach, ESAIM Control, Optimisation and Calculus of Variations (to appear), preprint, arXiv: 1603.02822, (2016).
    [11] F. Fleissner and G. Savaré, Reverse approximation of gradient flows as minimizing movements: A conjecture by De Giorgi, Forthcoming Articles, (2018), 30pp, arXiv: 1711.07256. doi: 10.2422/2036-2145.201711_008
    [12] The variational formulation of the Fokker-Planck equation. SIAM Journal of Mathematical Analysis (1998) 29: 1-17.
    [13] Gamma-convergence of gradient flows and application to Ginzburg-Landau. Communications on Pure and Applied Mathematics (2004) 57: 1627-1672.
  • © 2018 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)