1. Introduction
Let $ \mathcal{H} $ be a real Hilbert space and let $ \mathcal{K} $ be a nonempty, closed, and convex subset of $ \mathcal{H} $. The equilibrium problem (EP) is to find an element $ u^* \in \mathcal{K} $ such that
where $ f : \mathcal{K} \times \mathcal{K} \rightarrow \mathbb{R} $ is a bifunction satisfying $ f(z, z) = 0 $ for all $ z \in \mathcal{K} $, and $ EP(f, \mathcal{K}) $ denotes the solution set of EP (1.1). EP (1.1) generalizes many problems in optimization, such as variational inequality problems, Nash equilibrium problems, and linear programming problems, among others.
To solve EP (1.1), the two-step extragradient method (TSEM) was proposed by Tran et al. [25] in 2008, inspired by the ideas of Korpelevich [10] for solving variational inequalities. A convergence theorem was proved under a Lipschitz-type condition on the bifunction $ f $; however, only weak convergence was obtained in Hilbert spaces. Strong convergence was achieved by Hieu [8] with the Halpern subgradient extragradient method (HSEM), adapted from the method of Kraikaew and Saejung [11] for variational inequalities. This method is defined by $ u\in \mathcal{H} $ and
where $ \lambda $ is a constant chosen in an interval determined by the Lipschitz-type constants of the bifunction $ f $, and $ \{\alpha_i\}\subset (0, 1) $ satisfies the standard conditions
Very recently, Muangchoo [14] combined a viscosity-type method with the extragradient algorithm to obtain a strong convergence theorem for EP (1.1). This method is defined by
where $ \mu\in\left(0, \sigma\right)\subset\left(0, \min\left\lbrace1, \frac{1}{2c_{1}}, \frac{1}{2c_{2}}\right\rbrace\right) $, $ g $ is a contraction on $ \mathcal{H} $ with contraction constant $ \xi\in[0, 1) $ (i.e., $ \|g(x)-g(y)\|\leq\xi\|x-y\|, \ \forall x, y\in \mathcal{H} $), $ \{\alpha_i\} $ satisfies the standard conditions $ \lim\limits_{i \to \infty} \alpha_i = 0 $ and $ \sum_{i=1}^{+\infty} \alpha_i = +\infty $, and the step sizes $ \{\lambda_i\} $ are updated adaptively, without knowledge of the Lipschitz-type constants of the bifunction $ f $, according to the following rule:
where $ K_i = f(z_i, t_i)-f(z_i, v_i)-c_{1}\|z_{i}-v_{i}\|^{2}-c_{2}\|t_{i}-v_{i}\|^{2}+1 $.
Finding techniques to speed up the convergence of such algorithms has attracted the interest of many mathematicians. One of them is the inertial technique, first introduced by Polyak [16]. Very recently, Shehu et al. [22] combined the inertial technique with a Halpern-type algorithm and the subgradient extragradient method to obtain strong convergence to a solution of $ EP(f, \mathcal{K}) $ when $ f $ is pseudomonotone. This method is defined by $ \gamma \in \mathcal{H} $ and
where the inertial parameter $ \delta_i\in[0, \frac{1}{3}) $, $ \tau\in(0, \frac{1}{2}] $, and the step size $ \{\lambda_i\} $ is updated so that it satisfies the following:
and $ \{\alpha_i\} $ again satisfies the standard conditions $ \lim\limits_{i \to \infty} \alpha_i = 0 $ and $ \sum_{i=1}^{\infty} \alpha_i = \infty $. This step-size update is restrictive in computation and cannot easily be modified in another way.
In this paper, motivated and inspired by the works above, we introduce a modified inertial viscosity extragradient-type method for solving equilibrium problems. Moreover, we apply our main result to a data classification problem in machine learning and demonstrate the performance of our algorithm by comparing it with several existing methods.
2. Preliminaries
Let us begin with some concepts of monotonicity of a bifunction [2,15]. Let $ \mathcal{K} $ be a nonempty, closed and convex subset of $ \mathcal{H} $. A bifunction $ f : \mathcal{H} \times \mathcal{H} \rightarrow \mathbb{R} $ is said to be:
(i) strongly monotone on $ \mathcal{K} $ if there exists a constant $ \gamma > 0 $ such that
(ii) monotone on $ \mathcal{K} $ if $ f(x, y) \leqslant - f(y, x), \forall x, y \in \mathcal{K}; $
(iii) pseudomonotone on $ \mathcal{K} $ if $ f(x, y) \geqslant 0 \Longrightarrow f(y, x) \leqslant 0, \forall x, y \in \mathcal{K}; $
(iv) satisfying Lipschitz-type condition on $ \mathcal{K} $ if there exist two positive constants $ c_1, c_2 $ such that
The normal cone $ N_\mathcal{K} $ to $ \mathcal{K} $ at a point $ x \in \mathcal{K} $ is defined by $ N_\mathcal{K} (x) = \{ w \in \mathcal{H} : \langle w, x-y \rangle \geqslant 0, \forall y \in \mathcal{K} \} $. For every $ x \in \mathcal{H} $, the metric projection $ P_\mathcal{K} x $ of $ x $ onto $ \mathcal{K} $ is the nearest point to $ x $ in $ \mathcal{K} $, that is, $ P_\mathcal{K} x = \arg\min\{\|y-x\| : y \in \mathcal{K} \} $. Since $ \mathcal{K} $ is nonempty, closed and convex, $ P_\mathcal{K} x $ exists and is unique. For each $ x, z \in \mathcal{H} $, we denote by $ \partial_2 f(z, x) $ the subdifferential of the convex function $ f(z, \cdot) $ at $ x $, i.e.,
In particular, for $ z \in \mathcal{K} $,
For proving the convergence of the proposed algorithm, we need the following lemmas.
Lemma 2.1. [1] For each $ x \in \mathcal{H} $ and $ \lambda > 0 $,
where $ prox_{\lambda g} (x) = \arg\min\{ \lambda g(y) + \frac{1}{2} \|x-y\|^2 : y \in \mathcal{K} \} $.
For any point $ u \in \mathcal{H} $, there exists a unique point $ P_\mathcal{K} u \in \mathcal{K} $ such that
$ P_\mathcal{K} $ is called the metric projection of $ \mathcal{H} $ onto $ \mathcal{K} $. We know that $ P_\mathcal{K} $ is a nonexpansive mapping of $ \mathcal{H} $ onto $ \mathcal{K} $, that is, $ \|P_\mathcal{K}x-P_\mathcal{K}y\|\leq \|x-y\|, \ \forall x, y \in \mathcal{H} $.
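To make these notions concrete, the following minimal sketch (illustrative only, not part of the original text) evaluates $ P_\mathcal{K} $ for a box $ \mathcal{K} = [-1, 1]^n $, where the projection is a componentwise clipping, and checks numerically both the nonexpansiveness of $ P_\mathcal{K} $ and the variational characterization $ \langle x - P_\mathcal{K} x, z - P_\mathcal{K} x \rangle \leqslant 0 $ for $ z \in \mathcal{K} $.

```python
import numpy as np

def project_box(x, lo, hi):
    # Metric projection P_K onto the box K = [lo, hi]^n (componentwise clipping).
    return np.clip(x, lo, hi)

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
px, py = project_box(x, -1.0, 1.0), project_box(y, -1.0, 1.0)

# Nonexpansiveness: ||P_K x - P_K y|| <= ||x - y||
assert np.linalg.norm(px - py) <= np.linalg.norm(x - y) + 1e-12

# Variational characterization: <x - P_K x, z - P_K x> <= 0 for every z in K
z = rng.uniform(-1.0, 1.0, size=5)
assert np.dot(x - px, z - px) <= 1e-12
```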
Lemma 2.2. [18] Let $ g : \mathcal{K} \rightarrow \mathbb{R} $ be a convex, subdifferentiable and lower semicontinuous function on $ \mathcal{K} $. Suppose the convex set $ \mathcal{K} $ has nonempty interior, or $ g $ is continuous at a point $ x^* \in \mathcal{K} $. Then, $ x^* $ is a solution to the following convex problem $ \min\{g(x) : x \in \mathcal{K}\} $ if and only if $ 0 \in \partial g(x^*) + N_\mathcal{K} (x^*) $, where $ \partial g(.) $ denotes the subdifferential of $ g $ and $ N_\mathcal{K} (x^*) $ is the normal cone of $ \mathcal{K} $ at $ x^* $.
Lemma 2.3. [19,22] Let $ S $ be a nonempty, closed, and convex subset of a real Hilbert space $ \mathcal{H} $. Let $ u \in \mathcal{H} $ be arbitrarily given, $ z = P_S u $, and $ \Omega = \{ x \in \mathcal{H} : \langle x-u, x-z \rangle \leqslant 0 \} $. Then $ \Omega \cap S = \{z\} $.
Lemma 2.4. [28] Let $ \{a_i\} $ and $ \{c_i\} $ be nonnegative sequences of real numbers such that $ \sum_{i = 1}^{\infty} c_i < \infty $, and let $ \{b_i\} $ be a sequence of real numbers such that $ \limsup \limits_{i \to \infty} b_i \leqslant 0 $. If, for every $ i \in \mathbb{N} $,
where $ \{\gamma_i\} $ is a sequence in $ (0, 1) $ such that $ \sum_{i = 1}^{\infty} \gamma_i = \infty $, then $ \lim\limits_{i \to \infty} a_i = 0 $.
Lemma 2.5. [29] Let $ \{a_i\} $, $ \{b_i\} $ and $ \{c_i\} $ be positive sequences such that
If $ \sum_{i = 1}^{ \infty} c_i < +\infty $ and $ \sum_{i = 1}^{\infty} b_i < +\infty $; then, $ \lim\nolimits_{i \to +\infty} a_i $ exists.
The convergence of Algorithm 3.1 will be established under the following conditions.
Condition 2.6. (A1) $ f $ is pseudomonotone on $ \mathcal{K} $ with $ int(\mathcal{K}) \neq \emptyset $ or $ f(x, .) $ is continuous at some $ z \in \mathcal{K} $ for every $ x \in \mathcal{K} $;
(A2) $ f $ satisfies Lipschitz-type condition on $ \mathcal{H} $ with two constants $ c_1 $ and $ c_2 $;
(A3) $ f(., y) $ is sequentially weakly upper semicontinuous on $ \mathcal{K} $ for each fixed $ y \in \mathcal{K} $, i.e., if $ \{x_i\} \subset \mathcal{K} $ is a sequence converging weakly to $ x \in \mathcal{K} $, then $ f(x, y) \geqslant \limsup \limits_{i \to \infty} f(x_i, y) $;
(A4) for each $ x \in \mathcal{H} $, $ f(x, .) $ is convex, lower semicontinuous, and subdifferentiable on $ \mathcal{H} $;
(A5) $ V : \mathcal{H} \rightarrow \mathcal{H} $ is a contraction with contraction constant $ \alpha $.
3. Main results
Now, we are in a position to present a modification of the extragradient algorithm (EGM) of [25] for equilibrium problems.
In this section, we will analyse the convergence of Algorithm 3.1.
For the rest of this paper, we assume the following condition.
Condition 3.2. (i) $ \{\alpha_i\} \subset (0, 1] $ is non-increasing with $ \lim\limits_{i \to \infty}{\alpha_i} = 0 $ and $ \sum\limits_{i = 1}^{\infty}{\alpha_i} = \infty $;
(ii) $ 0 \leq \theta_{i} \leq \theta_{i+1} \leq \theta < \frac{1}{3} $ and $ \lim\limits_{i \to \infty} \frac{\theta_i}{\alpha_i} \|x_i - x_{i-1} \| = 0 $;
(iii) $ EP(f, \mathcal{K}) \not = \emptyset $.
Before we prove the strong convergence result, we need some lemmas below.
Lemma 3.3. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_{i}\} $ be generated by Algorithm 3.1. Then there exists $ N > 0 $ such that
Proof. By the definition of $ y_i $, and Lemma 2.1, we have
Putting $ y = z_{i} $ into (3.2), we obtain
By the definition of $ z_i $, we have
(3.3) and (3.4) imply that
If $ f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) > 0 $, then
Observe that (3.6) is also satisfied if $ f(w_i, z_i) - f(w_i, y_i) - f(y_i, z_i) \leq 0 $. By (3.5) and (3.6), we have
Note that
and
Using (3.8) and (3.9) in (3.7), we obtain, for all $ y \in \mathcal{K} $,
Taking $ y = u \in EP(f, \mathcal{K}) \subset \mathcal{K} $, one has $ f(u, y_i) \geqslant 0, \forall i $. By (A1), we obtain $ f(y_i, u) \leqslant 0, \ \forall i $. Hence, we obtain from (3.10) that
It follows from $ \lambda_i \in (0, \frac{1}{2\max\{c_1, c_2\}}) $ and (3.11) that
On the other hand, we have
Substituting (3.11) into (3.13), we obtain
Moreover, we have $ z_i-w_i = \frac{1}{\tau}(x_{i+1}-w_i) $, which together with (3.14) gives
where $ \epsilon = \frac{1-\tau}{\tau} $.
Lemma 3.4. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. Then, for all $ u \in EP(f, \mathcal{K}) $,
Proof. By Lemma 2.5, we have
Moreover, from the definition of $ w_i $, we obtain that
Replacing $ u $ by $ x_{i+1} $ in (3.18), we have
Substituting (3.18) and (3.19) into (3.17), we have
Therefore, we obtain
It follows that
Since $ {\theta_i} $ is non-decreasing and $ \alpha_i $ is non-increasing, we then obtain
Lemma 3.5. Assume that Conditions 2.6 and 3.2 hold. Then $ \{x_i\} $ generated by Algorithm 3.1 is bounded.
Proof. From (3.15) and Condition 3.2 (ii), there exists $ K > 0 $ such that
This implies that $ \|x_{i+1}-u\| \leqslant \max\{\|x_1-u\|, \frac{\|V(u)-u\| + K}{1-\alpha} \} $. This shows that $ \{x_i\} $ is bounded.
Lemma 3.6. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. For each $ i \geqslant 1 $, define
Then $ u_i \geqslant 0 $.
Proof. Since $ \{\theta_i\} $ is non-decreasing with $ 0 \leqslant \theta_i < \frac{1}{3} $, and $ 2\langle x, y \rangle = \|x\|^2 + \|y\|^2 -\|x-y\|^2 $ for all $ x, y \in H $, we have
This completes the proof.
Lemma 3.7. Assume that Conditions 2.6 and 3.2 hold. Let $ \{x_i\} $ be generated by Algorithm 3.1. Suppose
and
Then $ \{x_i\} $ converges strongly to $ u \in EP(f, \mathcal{K}) $.
Proof. By our assumptions, we have
In the case
this implies that $ \{x_i\} $ converges strongly to $ u $ immediately. Assume this limit does not hold. Then there exist a subset $ N^* \subseteq \mathbb{N} $ and a constant $ \rho > 0 $ such that
Using (3.27) and $ \theta_i \leqslant \theta < 1 $, it then follows, for $ i \in N^* $, that
Consequently, we have $ \limsup \limits_{i \rightarrow \infty} \|x_i-u\| \leqslant 0 $. Since $ \liminf_{i \rightarrow \infty} \|x_i-u\| \geqslant 0 $ obviously holds, it follows that $ \lim\limits_{i \rightarrow \infty} \|x_i-u\| = 0 $. This implies (by (3.28))
for all $ i \in N^* $ sufficiently large, a contradiction to the assumption that $ \lim\limits_{i \to \infty} \|x_{i+1}-x_i\| = 0 $. This completes the proof.
We now give the following strong convergence result of Algorithm 3.1.
Theorem 3.8. Assume that Conditions 2.6 and 3.2 hold. Then $ \{x_i\} $ generated by Algorithm 3.1 strongly converges to the solution $ u = P_{EP(f, \mathcal{K})} V(u) $.
Proof. From Lemma 3.6 and (3.16), we have
Since $ P_{EP(f, \mathcal{K})} V $ is a contraction, by the Banach fixed point theorem there exists a unique $ u = P_{EP(f, \mathcal{K})} V(u) $. It follows from Lemma 3.3 that
We consider two cases.
Case 1. Suppose that $ u_{i+1} \leqslant u_i+t_i $ for all $ i \geqslant i_0 $ for some $ i_0 \in \mathbb{N} $, where $ t_i \geqslant 0 $ and $ \sum_{i = 1}^{\infty} t_i < +\infty $. Since $ u_i \geqslant 0, \ \forall i \geqslant 1 $, by Lemma 2.5, $ \lim\limits_{i \to \infty} u_i = \lim\limits_{i \to \infty} u_{i+1} $ exists. Since $ \{x_i\} $ is bounded by Lemma 3.5, there exist $ M_1 > 0 $ such that $ 2|\langle x_i-u, x_i-V(x_i)\rangle| \leqslant M_1 $ and $ M_2 > 0 $ such that $ \|x_{i+1}-V(x_{i+1})\|^2 + \|V(x_i)-x_{i+1}\|^2 \leqslant M_2 $. Since $ 0 \leqslant \theta_i \leqslant \theta_{i+1} \leqslant \theta < \frac{1}{3} $ and $ \lim\limits_{i \to \infty} \alpha_i = 0 $, there exist $ N \in \mathbb{N} $ and $ \gamma_1 > 0 $ such that $ 1-3\theta_{i+1}-\alpha_i \geqslant \gamma_1 $ for all $ i \geqslant N $. Therefore, for $ i \geqslant N $, we obtain from (3.29) that
as $ \ i \rightarrow \infty $. Hence $ \lim\limits_{i \to \infty} \|x_{i+1}-x_i\| = 0 $. For $ u \in EP(f, \mathcal{K}) $, we have
and from (3.14), we have
This implies that
By our condition and (3.31), we obtain
Since $ \{x_i\} $ is bounded, there exists a subsequence $ \{x_{i_k}\} $ of $ \{x_i\} $ such that $ x_{i_k} \rightharpoonup x^* $ for some $ x^* \in \mathcal{H} $. From (3.31) and (3.35), we get $ w_{i_k} \rightharpoonup x^* $ and $ y_{i_k} \rightharpoonup x^* $ as $ k \rightarrow \infty $.
By the definition of $ z_i $ and (3.6), we have
It follows from the boundedness of $ \{z_{i_k}\} $, $ 0 < \lambda_{i_k} \leqslant \lambda < \frac{1}{2\max\{c_1, c_2\}} $, and Condition 2.6 (A3) that $ 0 \leqslant \limsup \limits_{k \to \infty} f(y_{i_k}, y) \leqslant f(x^*, y) $ for all $ y \in H $. This implies that $ f(x^*, y) \geqslant 0 $ for all $ y \in \mathcal{K} $, which shows that $ x^* \in EP(f, \mathcal{K}) $. Then, we have
by $ u = P_{EP(f, \mathcal{K})} V(u) $. Applying (3.36) to the inequality (3.30) with Lemma 2.4, we can conclude that $ x_i \rightarrow u = P_{EP(f, \mathcal{K})} V(u) $ as $ i \rightarrow \infty $.
Case 2. In another case of $ \{u_i\} $, we let $ \phi : \mathbb{N} \rightarrow \mathbb{N} $ be the map defined for all $ i \geqslant i_0 $ (for some $ i_0 \in \mathbb{N} $ large enough) by
Clearly, $ \phi(i) $ is non-decreasing with $ \phi(i) \rightarrow \infty $ as $ i \rightarrow \infty $ and $ u_{\phi(i)}+t_{\phi(i)} \leqslant u_{\phi(i)+1} $ for all $ i \geqslant i_0 $. Hence, similar to the proof of Case 1, we obtain from (3.31) that
for some constant $ M_1, M_2 > 0 $. Thus
By the same proof as in Case 1, one can also derive
Again, observe that for $ j \geqslant 0 $, by (3.29), we have $ u_{j+1} < u_j+t_j $ when $ x_j \notin \Omega = \{x \in H : \langle x-x_0, x-u \rangle \leqslant 0\} $. Hence $ x_{\phi (i)} \in \Omega $ for all $ i \geqslant i_0 $, since $ u_{\phi (i)}+t_{\phi (i)} \leqslant u_{\phi (i)+1} $. Since $ \{x_{\phi (i)}\} $ is bounded, there exists a subsequence of $ \{x_{\phi (i)}\} $, still denoted by $ \{x_{\phi (i)}\} $, which converges weakly to some $ x^* \in \mathcal{H} $. As $ \Omega $ is closed and convex, it is weakly closed, and so $ x^* \in \Omega $. Using (3.40), one can see as in Case 1 that $ z_{\phi (i)} \rightharpoonup x^* $ and $ x^* \in EP(f, \mathcal{K}) $. Since $ \Omega \cap EP(f, \mathcal{K}) $ contains $ u $ as its only element, we therefore have $ x^* = u $. Furthermore,
due to $ x_{\phi (i)} \in \Omega $. This gives
Hence
By the definition of $ u_{\phi (i) +1} $, we have
By our Condition 3.2 (i), (3.39) and (3.41), we obtain $ \lim\limits_{i \to \infty} u_{\phi(i)+1} = 0 $. We next show that we actually have $ \lim\limits_{i \to \infty} u_i = 0 $. To this end, first observe that, for $ i \geqslant i_0 $, one has $ u_i+t_i \leqslant u_{\phi (i) + 1} $ if $ i \neq \phi(i) $. It follows that for all $ i \geqslant i_0 $, we have $ u_i \leqslant \max\{u_{\phi(i)}, u_{\phi (i) + 1} \} = u_{\phi (i) + 1} \rightarrow 0 $, since $ \lim\limits_{i \to \infty} t_i = 0 $, hence $ \limsup \limits_{i \to \infty} u_i \leqslant 0 $. On the other hand, Lemma 3.6 implies that $ \liminf_{i \to \infty} u_i \geqslant 0 $. Hence, we obtain $ \lim\limits_{i \to \infty} u_i = 0 $. Consequently, the boundedness of $ \{x_i\} $, $ \lim\limits_{i \to \infty} \alpha_i = 0 $, and (3.29) show that $ \|x_i - x_{i+1}\| \rightarrow 0 $, as $ i \rightarrow \infty $. Hence the definition of $ u_i $ yields $ (\|x_{i+1}-u\|^2 - \theta_i \|x_i-u\|^2) \rightarrow 0 $, as $ i \rightarrow \infty $. By using Lemma 3.7, we obtain the desired conclusion immediately.
Setting $ V(x) = x_0 $ for all $ x\in \mathcal{H} $, we obtain the following modified Halpern inertial extragradient algorithm for EPs:
In Algorithm 3.1, the convergence depends on the parameter $ \{\lambda_i\} $ under the condition $ 0 < \lambda_{i}\leq\lambda < \frac{1}{2\max\{c_1, c_2\}} $, so the step size $ \{\lambda_i\} $ can be chosen in several ways. Applying the step-size concept of Shehu et al. [22], we obtain the following modified viscosity-type inertial extragradient step-size algorithm for EPs:
Remark 3.11. (i) Since $ V(x) = x_0, \ \forall x\in \mathcal{H} $, is a contraction, the modified Halpern inertial extragradient Algorithm 3.9 converges strongly to $ x^* = P_{EP(f, \mathcal{K})}x_0 $ under Conditions 2.6 and 3.2;
(ii) Since the step size $ \{\lambda_i\} $ in Algorithm 3.10 is a monotonically decreasing sequence with lower bound $ \min\{\lambda_1, \frac{1}{2\max\{c_1, c_2\}}\} $ [22], Algorithm 3.10 converges strongly to the solution $ u = P_{EP(f, \mathcal{K})} V(u) $ by Theorem 3.8.
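For readers implementing such an adaptive rule, the following is a minimal sketch of one common step-size update of this type from the related literature; it is not a verbatim reproduction of the update used in Algorithm 3.10, and the names $ f $, $ w $, $ y $, $ z $, $ \mu $ refer to the bifunction, the current extragradient iterates, and a parameter in $ (0, 1) $.

```python
import numpy as np

def update_step_size(lam, mu, f, w, y, z):
    # One common adaptive step-size rule (a sketch; the exact update in
    # Algorithm 3.10 may differ): the resulting sequence is non-increasing and
    # bounded below, and it avoids the Lipschitz-type constants c1 and c2.
    denom = f(w, z) - f(w, y) - f(y, z)
    if denom > 0:
        candidate = mu * (np.linalg.norm(w - y) ** 2 + np.linalg.norm(z - y) ** 2) / (2.0 * denom)
        return min(lam, candidate)
    return lam
```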
We now give an example in the infinite-dimensional space $ L_2[0, 1] $ to support the main theorem.
Example 3.12. Let $ V : L_2[0, 1] \rightarrow L_2[0, 1] $ be defined by $ V(x(t)) = \frac{x(t)}{2} $, where $ x(t) \in L_2[0, 1] $. We choose $ x_0(t) = \frac{\sin(t)}{2} $ and $ x_1(t) = \sin(t) $. The stopping criterion is defined by $ \|x_i-x_{i-1}\| < 10^{-2} $.
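In a numerical implementation, functions in $ L_2[0, 1] $ can be represented on a uniform grid and the norm in the stopping criterion approximated by quadrature; the following is a minimal sketch under these assumptions (variable names are illustrative).

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1001)        # uniform grid on [0, 1]

def l2_norm(x):
    # Approximate L_2[0,1] norm of a function sampled on the grid t (trapezoidal rule).
    return np.sqrt(np.trapz(x ** 2, t))

V = lambda x: x / 2.0                  # viscosity mapping V(x(t)) = x(t)/2
x_prev = np.sin(t) / 2.0               # x_0(t) = sin(t)/2
x_curr = np.sin(t)                     # x_1(t) = sin(t)
print(l2_norm(x_curr - x_prev))        # compare against the tolerance 10^{-2}
```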
We set the following parameters for each algorithm, as seen in Table 1.
Next, we compare the performance of Algorithms 3.1, 3.9 and 3.10. We obtain the results as seen in Table 2.
From Figure 1, we see that the performance of Algorithm 3.10 is better than Algorithms 3.1 and 3.9.
4. Application to data classification problem
According to the International Diabetes Federation (IDF), there are approximately 463 million people with diabetes worldwide, and it is estimated that this number will reach 629 million by 2045. In Thailand, the incidence of diabetes is continuously increasing, with about 300,000 new cases per year and 3.2 million people with diabetes registered in the Ministry of Public Health's registration system. The disease causes huge losses in health care costs: diabetes alone accounts for average treatment costs of up to 47,596 million baht per year. This has led to an ongoing campaign about the dangers of the disease. Furthermore, diabetes mellitus is associated with additional noncommunicable diseases that put patients at high risk, as they are more susceptible to infectious diseases such as COVID-19 [23]. Because diabetes is a chronic disease that cannot be cured, there is a risk that complications spread to the extent of damaging vital organs of the body. The International Diabetes Federation and the World Health Organization (WHO) have designated November 14 of each year as World Diabetes Day to recognize the importance of this disease.
In this research, we used the PIMA Indians diabetes dataset, which was downloaded from Kaggle (https://www.kaggle.com/uciml/pima-indians-diabetesdatabase) and is publicly available on the UCI repository, for the training process of our proposed algorithm. The dataset contains 768 female patients, of whom 500 are non-diabetic and 268 are diabetic. There are 9 variables in the dataset: eight variables contain information about the patients, and the 9th variable is the class labeling each patient as diabetic or non-diabetic. The attributes are: number of times pregnant; plasma glucose concentration at 2 hours in an oral glucose tolerance test (GTIT); diastolic blood pressure (mm Hg); triceps skin fold thickness (mm); 2-hour serum insulin ($\mu$U/ml); body mass index [weight in kg/(height in m)$^2$]; diabetes pedigree function; age (years); and a binary value indicating non-diabetic/diabetic. For the implementation of the machine learning algorithms, 614 samples were used as the training dataset and 154 as the testing dataset, using 5-fold cross-validation [12]. For benchmarking the classifier, we consider the following methods, which have been proposed to classify diabetes (see Table 3):
We focus on extreme learning machine (ELM) proposed by Huang et al. [9] for applying our algorithms to solve data classification problems. It is defined as follows:
Let $ E : = \{(\mathbf{x}_n, \mathbf{t}_n) : \mathbf{x}_n\in\mathbb{R}^n, \mathbf{t}_n\in\mathbb{R}^m, n = 1, 2, ..., P\} $ be a training set of $ P $ distinct samples where $ \mathbf{x}_n $ is an input training data and $ \mathbf{t}_n $ is a training target. The output function of ELM for single-hidden layer feed forward neural networks (SLFNs) with $ M $ hidden nodes and activation function $ U $ is
where $ w_j $ and $ b_j $ are the weight and bias parameters, respectively, and $ \Theta_j $ is the optimal output weight at the $ j $-th hidden node. The hidden layer output matrix $ H $ is defined as follows:
Solving the ELM amounts to finding an optimal output weight $ \Theta = [\Theta^T_1, ..., \Theta^T_M]^T $ such that $ H \Theta = T $, where $ T = [\mathbf{t}^T_1, ..., \mathbf{t}^T_P]^T $ is the matrix of training targets. In some cases, one takes $ \Theta = H^{\dagger} T $, where $ H^{\dagger} $ is the Moore-Penrose generalized inverse of $ H $. However, when $ H^{\dagger} $ is unavailable or difficult to compute, finding such a solution $ \Theta $ through convex minimization can overcome this difficulty.
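As an illustration of this construction, the following sketch (with illustrative variable names and placeholder data, not the paper's experimental code) forms the hidden layer output matrix $ H $ with a sigmoid activation and computes $ \Theta = H^{\dagger} T $ via the Moore-Penrose pseudoinverse.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def hidden_layer_matrix(X, W, b):
    # H[n, j] = U(w_j . x_n + b_j) with sigmoid activation U.
    return sigmoid(X @ W.T + b)

rng = np.random.default_rng(0)
P, d, M = 614, 8, 160                               # samples, input features, hidden nodes
X = rng.normal(size=(P, d))                         # input training data (placeholder)
T = rng.integers(0, 2, size=(P, 1)).astype(float)   # training targets (placeholder)
W = rng.normal(size=(M, d))                         # random input weights w_j
b = rng.normal(size=M)                              # random biases b_j

H = hidden_layer_matrix(X, W, b)                    # P x M hidden layer output matrix
Theta = np.linalg.pinv(H) @ T                       # Theta = H^dagger T
```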
In this section, we carry out some experiments on the classification problem. This problem can be seen as the following convex minimization problem:
where $ \lambda $ is a regularization parameter. This problem is called the least absolute shrinkage and selection operator (LASSO) [26]. We set $ f(\Theta, \zeta) = \langle H^T(H\Theta-T), \zeta-\Theta\rangle $ and $ V(x) = Cx $, where $ C $ is a constant in $ (0, 1) $.
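With this choice, $ f(\Theta, \cdot) $ is affine, so the proximal subproblems of the extragradient scheme admit closed forms. The following sketch shows only the basic two-step extragradient (EGM) structure of [25] on which Algorithm 3.1 builds, specialized to this bifunction on an unconstrained set and with the $ \ell_1 $ regularization term omitted; it is an illustrative sketch, not the full Algorithm 3.1.

```python
import numpy as np

def G(theta, H, T):
    # G(theta) = H^T (H theta - T); with f(theta, zeta) = <G(theta), zeta - theta>,
    # each proximal subproblem below reduces to an explicit step.
    return H.T @ (H @ theta - T)

def extragradient_step(theta, lam, H, T):
    # y = argmin { lam * f(theta, y) + 0.5 ||theta - y||^2 } = theta - lam * G(theta)
    # z = argmin { lam * f(y, z)     + 0.5 ||theta - z||^2 } = theta - lam * G(y)
    y = theta - lam * G(theta, H, T)
    z = theta - lam * G(y, H, T)
    return z

# usage sketch (H, T as in the ELM construction; lam below 1/L, L = max eigenvalue of H^T H):
# L = np.linalg.eigvalsh(H.T @ H).max()
# theta = np.zeros((H.shape[1], 1))
# for _ in range(250):
#     theta = extragradient_step(theta, 0.99 / L, H, T)
```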
The binary cross-entropy loss function, together with the sigmoid activation function for binary classification, calculates the loss of an example by computing the following average:
where $ \hat{y}_j $ is the $ j $-th scalar value in the model output, $ y_j $ is the corresponding target value, and $ K $ is the number of scalar values in the model output.
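A minimal sketch of this loss (with illustrative names; the model outputs are assumed to have already passed through the sigmoid) is given below.

```python
import numpy as np

def binary_cross_entropy(y_true, y_hat, eps=1e-12):
    # Average binary cross-entropy over the K scalar outputs y_hat of the model.
    y_hat = np.clip(y_hat, eps, 1.0 - eps)   # avoid log(0)
    return -np.mean(y_true * np.log(y_hat) + (1.0 - y_true) * np.log(1.0 - y_hat))
```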
In this work, the performance of the machine learning techniques is measured for all classes. The accuracy is calculated as the ratio of the total number of correct predictions to the total number of predictions. The precision and recall performance parameters are also measured. The three measures [27] are defined as follows:
where the confusion matrix for the original and predicted classes is expressed in terms of $ \mathsf{TP} $, $ \mathsf{TN} $, $ \mathsf{FP} $, and $ \mathsf{FN} $, which denote the true positives, true negatives, false positives, and false negatives, respectively. Similarly, $ \mathsf{P} $ and $ \mathsf{N} $ are the positive and negative populations of the diabetic and non-diabetic cases, respectively.
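These three measures can be computed directly from the confusion matrix counts, as in the following sketch.

```python
def classification_measures(TP, TN, FP, FN):
    # Precision, recall, and accuracy from the confusion matrix counts.
    precision = TP / (TP + FP)
    recall = TP / (TP + FN)
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    return precision, recall, accuracy
```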
For starting our computation, we set the activation function as sigmoid, hidden nodes $ M = 160 $, regularization parameter $ \lambda = 1\times10^{-5} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $, $ \mu = 0.2 $ for Algorithms 3.1, 3.9 and 3.10, and $ C = 0.9999 $ for Algorithms 3.1 and 3.10. The stopping criterion is 250 iterations. We report the results for different parameters $ S $, where $ \lambda_i = \frac{S}{\max(eigenvalue(A^{T}A))} $ for Algorithms 3.1 and 3.9, and for different parameters $ \lambda_1 $ for Algorithm 3.10, in Table 4.
We can see that $ S = \lambda_1 = 0.99 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default step-size parameter for the next computations.
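For concreteness, the step size $ \lambda_i = \frac{S}{\max(eigenvalue(A^{T}A))} $ used above can be computed as in the sketch below, assuming that $ A $ denotes the hidden layer output matrix $ H $ of the ELM construction.

```python
import numpy as np

def step_size(S, H):
    # lambda_i = S / max(eigenvalue(A^T A)), assuming A is the hidden layer matrix H.
    return S / np.linalg.eigvalsh(H.T @ H).max()
```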
We next choose $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, with $ C = 0.9999 $ for Algorithm 3.1, and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \alpha_i = \frac{1}{i+1} $, $ \tau = 0.5 $, $ C = 0.9999 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider different values of the inertial parameter $ \theta $, where
where $ N $ is the number of iterations at which we stop. We obtain the numerical results shown in Table 5.
We can see that $ \theta = 0.3 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default inertial parameter for the next computations.
We next set $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, with $ C = 0.9999 $ for Algorithm 3.1, and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $, $ C = 0.9999 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider different choices of the parameter $ \alpha_i $. The numerical results are shown in Table 6.
We can see that $ \alpha_i = \frac{1}{10i+1} $ greatly improves the performance of Algorithm 3.1, $ \alpha_i = \frac{1}{10i^2+1} $ greatly improves the performance of Algorithm 3.9, and $ \alpha_i = \frac{1}{i^2+1} $ greatly improves the performance of Algorithm 3.10. Therefore, we choose these as the default parameters for the next computations.
We next calculate the numerical results by setting $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{10i+1} $ and $ C = 0.9999 $ for Algorithm 3.1; $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{10i^2+1} $ for Algorithm 3.9; and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ C = 0.9999 $, $ \alpha_i = \frac{1}{i^2+1} $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We consider different choices of the parameter $ \tau $. The numerical results are shown in Table 7.
We can see that $ \tau = 0.5 $ greatly improves the performance of Algorithms 3.1, 3.9 and 3.10. Therefore, we choose it as the default parameter for the next computations.
We next calculate the numerical results by setting $ \lambda_i = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \tau = 0.5 $ for Algorithms 3.1 and 3.9, with $ \alpha_i = \frac{1}{10i+1} $ for Algorithm 3.1 and $ \alpha_i = \frac{1}{10i^2+1} $ for Algorithm 3.9, and $ \lambda_1 = \frac{0.99}{\max(eigenvalue(A^{T}A))} $, $ \theta_i = 0.3 $, $ \alpha_i = \frac{1}{i^2+1} $, $ \tau = 0.5 $, and $ \mu = 0.2 $ for Algorithm 3.10. The stopping criterion is 250 iterations. We report the results for different parameters $ C $, where $ V(x) = Cx $, for Algorithms 3.1 and 3.10 in Table 8.
From Tables 3–8, we choose the parameters of Algorithm 3.1 for comparison with existing algorithms from the literature. Table 9 shows the chosen parameters of each algorithm.
For the comparison, we set sigmoid as the activation function, hidden nodes $ M = 160 $, and regularization parameter $ \lambda = 1 \times 10^{-5} $.
Table 10 shows that Algorithm 3.1 has the highest precision, recall, and accuracy, as well as the lowest number of iterations. It has the highest probability of correctly classifying the samples compared with the other algorithms examined. We present the training and validation loss, together with the training accuracy, to show that Algorithm 3.1 does not overfit the training dataset.
From Figures 2 and 3, we see that our Algorithm 3.1 yields a good-fit model; that is, Algorithm 3.1 suitably learns the training dataset and generalizes well to classifying the PIMA Indians diabetes dataset.
5. Conclusions
In general, screening for diabetes in pregnancy follows the recommendations of the American College of Obstetricians and Gynecologists (ACOG). The accuracy of our method is $ 80.03\% $, and such high accuracy may be used to predict diabetes in pregnancy correctly in the future.
In this paper, we introduce a modified extragradient method with an inertial extrapolation step and a viscosity-type method to solve equilibrium problems with pseudomonotone bifunctions in real Hilbert spaces. We then prove a strong convergence theorem for the proposed algorithm under the assumption that the bifunction satisfies a Lipschitz-type condition. Moreover, we show that the step-size parameter $ \{\lambda_i\} $ can be chosen in several ways, which makes our algorithm flexible to use; see Algorithms 3.1 and 3.10. Finally, we show that our algorithms perform better than existing algorithms for the diabetes mellitus classification problem in machine learning.
Acknowledgments
This research was also supported by Fundamental Fund 2022, Chiang Mai University and the NSRF via the Program Management Unit for Human Resources and Institutional Development, Research and Innovation (grant number B05F640183). W. Cholamjiak would like to thank National Research Council of Thailand (N42A650334) and Thailand Science Research and Innovation, the University of Phayao (FF65-UOE).
Conflict of interest
The authors declare no conflict of interest.