1. Introduction
In this paper, let H represent a real Hilbert space equipped with the inner product ⟨⋅,⋅⟩ and its corresponding norm ||⋅||. We denote the sets of real numbers and positive integers by R and N, respectively.
A fixed point problem involves finding a point a ∈ H such that Ua = a,
where U is a self-mapping on H. We denote the set of solutions to the fixed point problem of the mapping U by Fix(U). Iterative methods are crucial in fixed point theory, especially when dealing with nonlinear mappings. The Picard iteration method [1], developed by Charles Émile Picard in the late 19th century, has long been a cornerstone of mathematical analysis, particularly in solving differential equations. It is highly effective for contractive mappings, where it guarantees convergence to a unique solution by repeatedly applying the function to an initial guess. However, its effectiveness is limited when dealing with nonexpansive mappings, where convergence is not always assured. To address this limitation, Mann [2] introduced the Mann iteration method in 1953, broadening the scope of the Picard method to include nonexpansive mappings. Since then, the Mann iteration has become an essential tool in fixed point theory and numerical analysis, particularly in cases where traditional methods might struggle to converge. In 2013, Khan [3] introduced the Picard-Mann iteration method, a hybrid approach that combines elements of both the Picard and Mann iterations. This method was designed to enhance the convergence of traditional techniques, particularly for certain types of mappings where standard methods can be slow or less effective. By blending the simplicity of Picard's method with the flexibility of Mann's, the Picard-Mann iteration offers a more efficient way to find fixed points, especially in more complex or nonexpansive scenarios. This hybrid approach not only expands its applicability but also has the potential to accelerate convergence compared to using either method alone. Khan further demonstrated both weak and strong convergence results for nonexpansive mappings in uniformly convex Banach spaces using the Picard-Mann iteration method.
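For concreteness, the three iteration schemes just discussed can be sketched in a few lines of Python. This is a minimal illustration for a generic mapping U with caller-supplied parameters α_n; the toy contraction in the usage line is our own example, not from the paper.

```python
def picard(U, x, iters=100):
    # Picard: x_{n+1} = U(x_n)
    for _ in range(iters):
        x = U(x)
    return x

def mann(U, x, alphas):
    # Mann: x_{n+1} = (1 - a_n) x_n + a_n U(x_n)
    for a in alphas:
        x = (1 - a) * x + a * U(x)
    return x

def picard_mann(U, x, alphas):
    # Khan's hybrid: y_n = (1 - a_n) x_n + a_n U(x_n), then x_{n+1} = U(y_n)
    for a in alphas:
        y = (1 - a) * x + a * U(x)
        x = U(y)
    return x

# Toy usage (our example): U is a contraction with unique fixed point 2.0.
U = lambda t: 0.5 * t + 1.0
print(picard_mann(U, 0.0, alphas=[0.5] * 30))  # -> approximately 2.0
```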
A variational inclusion problem (VIP) is typically formulated as follows: find a ∈ H such that 0 ∈ Xa + Ya,
where X:H→H is a single-valued mapping and Y:H→2^H is a multivalued mapping. The set of all solutions to the VIP is denoted by (X+Y)^{-1}(0). The study of solving the VIP has garnered significant attention among researchers in optimization and functional analysis. Numerous researchers have focused on developing efficient algorithms for addressing this problem. Two methods that have gained considerable popularity are the forward-backward splitting method [4,5] and the Tseng splitting method [6]. Various researchers have modified and adapted the forward-backward splitting method to solve the VIP (see [7,8,9]). The Tseng splitting method is another approach that has been refined and applied to solve the VIP (see [10,11,12]). Both methodologies are extensively referenced in the academic literature on solving the VIP, owing to their efficiency and flexibility in application to various problem formulations. Researchers have further developed these methods, enhancing convergence rates, computational error tolerance, and the capacity to accommodate additional problem constraints. In 2018, Gibali and Thong [13] developed a modified iterative method aimed at solving the VIP. They enhanced existing techniques by adjusting the step size rule, based on Tseng's method and the Mann iteration method, to improve both efficiency and applicability. The method is detailed as follows:
Under the right conditions, this approach demonstrates strong convergence and provides practical benefits, making it particularly valuable in real-world scenarios. Prior to this, in 1964, Polyak [14] proposed the inertial extrapolation technique, known as the heavy ball method, to accelerate the convergence of iterative algorithms. In 2020, Padcharoen et al. [11] introduced the following splitting method that builds on Tseng's approach and incorporates the inertial extrapolation technique for solving the VIP.
While weak convergence was confirmed under typical conditions, the method has also proven effective in practical applications, such as image deblurring and recovery.
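To make the splitting template behind these methods concrete, here is a minimal sketch of one forward-backward step and one Tseng (forward-backward-forward) step, under the illustrative assumption that Y = ∂(λ‖·‖₁), whose resolvent is componentwise soft-thresholding; all function names are ours.

```python
import numpy as np

def soft_threshold(v, t):
    # Resolvent (I + t*Y)^{-1} when Y is the subdifferential of t*||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward_step(x, grad, mu, lam):
    # x_{n+1} = (I + mu*Y)^{-1}(x_n - mu*X(x_n)), with X = grad
    return soft_threshold(x - mu * grad(x), mu * lam)

def tseng_step(x, grad, mu, lam):
    # Forward-backward step, then Tseng's correction term
    y = soft_threshold(x - mu * grad(x), mu * lam)
    return y + mu * (grad(x) - grad(y))
```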
In this study, we are interested in investigating the fixed point problem and the VIP simultaneously, that is, the problem of finding a point a ∈ Fix(U) ∩ (X+Y)^{-1}(0),
where Y:H→2^H is a maximal monotone mapping, X:H→H is an ℓ-Lipschitz continuous monotone mapping, and U:H→H is a quasi-nonexpansive demiclosed mapping. We denote the set of solutions to this problem by Ψ. Recently, Mouktonglang et al. [15] introduced the following method for finding fixed points of a demicontractive mapping and solutions of the VIP in the case where X=∇f and Y=∂g, where f:H→R and g:H→R∪{+∞} are two proper, lower semicontinuous, and convex functions.
The authors proved a weak convergence theorem under specific conditions using this method and demonstrated it with a numerical example in signal recovery. The approach is inspired by the proximal gradient technique, double inertial steps, and Mann iteration.
In this article, we present a novel algorithm that demonstrates weak convergence to a common solution of fixed point problems involving quasi-nonexpansive mappings and variational inclusion problems within the context of real Hilbert spaces, under reasonable assumptions. The proposed algorithm, together with the essential assumptions, is presented in Section 3. Additionally, in Section 4, we employ this algorithm in conjunction with an extreme learning machine for data classification, specifically to predict osteoporosis risk.
2. Preliminaries
We gather essential definitions and lemmas needed to establish our main results. We denote weak and strong convergence by ⇀ and →, respectively. For all a, b ∈ H, we have
‖a + b‖² ≤ ‖a‖² + 2⟨b, a + b⟩  (2.1)
and
‖γa + (1−γ)b‖² = γ‖a‖² + (1−γ)‖b‖² − γ(1−γ)‖a−b‖²  (2.2)
for any γ∈R.
Definition 2.1. A self-mapping X:H→H is called
(i) ℓ-Lipschitz continuous if there is ℓ>0 such that ‖Xa−Xb‖≤ℓ‖a−b‖ for all a,b∈H;
(ii) nonexpansive if X is 1-Lipschitz continuous;
(iii) quasi-nonexpansive if Fix(X) is nonempty and ‖Xa−r‖≤‖a−r‖ for all a∈H and r∈Fix(X);
(iv) demiclosed if for any sequence {a_n}⊂H, the following implication holds: if a_n ⇀ a and ‖Xa_n − a_n‖ → 0, then a ∈ Fix(X).
Definition 2.2. Let Y:H→2^H be a multivalued mapping. Then Y is said to be
(i) monotone if for all (a,c),(b,d)∈graph(Y) (the graph of mapping Y), ⟨c−d,a−b⟩≥0;
(ii) maximal monotone if for every (a,c)∈H×H, ⟨c−d,a−b⟩≥0 for all (b,d)∈graph(Y) if and only if (a,c)∈graph(Y).
Lemma 2.3. [16] Let X:H→H be a mapping and Y:H→2^H a maximal monotone mapping. If T_μ := (I + μY)^{-1}(I − μX) with μ > 0, then Fix(T_μ) = (X+Y)^{-1}(0).
Lemma 2.4. [17] If Y:H→2^H is a maximal monotone mapping and X:H→H is a Lipschitz continuous monotone mapping, then the sum X+Y is also maximal monotone.
3. Main results
To analyze the convergence, we assume the following conditions.
(C1) Y:H→2^H is a maximal monotone mapping.
(C2) X:H→H is an ℓ-Lipschitz continuous monotone mapping.
(C3) U:H→H is a quasi-nonexpansive demiclosed mapping.
(C4) Ψ is nonempty.
The following algorithm will be employed for Theorem 3.2.
Remark 3.1. Suppose that U and X are mappings on H and item (C1) holds. According to Lemma 2.3, if Ub_n = b_n = c_n = d_n in Algorithm 4, then it is easy to show that d_n ∈ Ψ.
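For readers who wish to experiment numerically, the following is a hypothetical Python sketch of one iteration, reconstructed only from the quantities d_n, c_n, b_n, and e_n that appear in Remark 3.1 and in the proof of Theorem 3.2; the authors' precise update rules may differ, and the resolvent JY, the mappings X and U, and all names are placeholders.

```python
def algorithm4_step(a, a_prev, a_prev2, theta, delta, mu, alpha, X, JY, U):
    """One hypothetical iteration (a sketch, not the authors' code).
    X: single-valued monotone operator; JY(v, mu) = (I + mu*Y)^{-1} v;
    U: quasi-nonexpansive mapping."""
    # Double-inertial extrapolation, suggested by conditions (C7) and (C8)
    d = a + theta * (a - a_prev) + delta * (a_prev - a_prev2)
    # Backward (resolvent) step; the proof of Claim 7 uses
    # (1/mu_n)(d_n - c_n - mu_n*X d_n) in Y c_n, i.e., c_n = JY(d_n - mu_n*X d_n)
    c = JY(d - mu * X(d), mu)
    # Tseng-type correction, consistent with the (mu*l)^2 factor in Claim 5
    b = c + mu * (X(d) - X(c))
    # Mann step with U, matching e_n = alpha_n*d_n + (1 - alpha_n)*U b_n in Claim 3
    e = alpha * d + (1 - alpha) * U(b)
    return e  # plausibly taken as the next iterate a_{n+1}
```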
Next, we are ready to prove Theorem 3.2.
Theorem 3.2. Let the sequence {a_n} be generated by Algorithm 4 satisfying items (C1)–(C4). Assume that the following conditions are satisfied:
(C5) 0 < lim inf_{n→∞} α_n ≤ lim sup_{n→∞} α_n < 1.
(C6) 0 < lim inf_{n→∞} μ_n ≤ lim sup_{n→∞} μ_n < 1/ℓ.
(C7) ∑_{n=1}^{∞} |θ_n| ‖a_n − a_{n−1}‖ < ∞.
(C8) ∑_{n=1}^{∞} |δ_n| ‖a_{n−1} − a_{n−2}‖ < ∞.
Then {a_n} converges weakly to a point in Ψ.
Proof. Let ã ∈ Ψ. By condition (C6), there are n_0 ∈ N, μ̲ > 0, and μ̄ < 1/ℓ such that μ̲ ≤ μ_n ≤ μ̄ for all n ≥ n_0. We will now establish the following claims.
Claim 1. For any n∈N,
By using the definition of c_n, we have
Thus, we can write
where y_n ∈ Yc_n. Since X+Y is maximal monotone, we obtain
implying that
Claim 2. For each n ≥ n_0,
From (2.1) and the fact that U is a quasi-nonexpansive mapping, we have
Using Claim 1, we get
By the Lipschitz continuity of X, we have that
Thus, Claim 2 is established.
Claim 3. lim_{n→∞} ‖a_n − ã‖ = lim_{n→∞} ‖d_n − ã‖ = lim_{n→∞} ‖e_n − ã‖, where e_n = α_n d_n + (1−α_n)Ub_n.
Since U is a quasi-nonexpansive mapping and using Claim 2, we have
Applying this to Lemma 1 in [18] with conditions (C7) and (C8), we derive that the sequence {‖a_n − ã‖} converges, and hence lim_{n→∞} ‖a_n − ã‖ = lim_{n→∞} ‖d_n − ã‖ = lim_{n→∞} ‖e_n − ã‖. In particular, {a_n}, {d_n}, and {e_n} are bounded.
Claim 4. lim_{n→∞} ‖d_n − Ub_n‖ = 0.
From (2.2), the fact that U is a quasi-nonexpansive mapping, and Claim 2, we have
This, together with Claim 3 and condition (C5), implies that lim_{n→∞} ‖d_n − Ub_n‖ = 0.
Claim 5. lim_{n→∞} ‖Ub_n − b_n‖ = 0.
Again, using Claim 2, we get, for all n ≥ n_0,
Thus, we obtain from Claim 4 and 1 − (μ̄ℓ)² > 0 that
By the Lipschitz continuity of X, it follows that
which by (3.1) yields
Combining (3.2) and Claim 4, we have that
Claim 6. lim_{n→∞} ‖a_n − b_n‖ = lim_{n→∞} ‖a_n − c_n‖ = 0.
By using the definition of d_n, the following inequalities are obtained:
and
Combining (3.1)–(3.2) and the conditions (C7)–(C8), we deduce that Claim 6 is true.
Claim 7. Every weak sequential cluster point of {a_n} belongs to Ψ.
Let a* be a weak sequential cluster point of {a_n}. Then a_{n_k} ⇀ a* as k→∞ for some subsequence {a_{n_k}} of {a_n}. This implies by Claim 6 that b_{n_k} ⇀ a* and c_{n_k} ⇀ a* as k→∞. It follows, from the fact that U is a demiclosed mapping and Claim 5, that a* ∈ Fix(U). Next, we show that a* ∈ (X+Y)^{-1}(0). Let (v,u) ∈ graph(X+Y), that is, u − Xv ∈ Yv. The definition of c_n implies that (1/μ_{n_k})(d_{n_k} − c_{n_k} − μ_{n_k} X d_{n_k}) ∈ Y c_{n_k}. By the maximal monotonicity of Y, we have
Thus, by the monotonicity of X, we get
Then, from the Lipschitz continuity of X and (3.1), we obtain
which, combined with the maximal monotonicity of X+Y, implies that a* ∈ (X+Y)^{-1}(0). Hence, a* ∈ Ψ. Finally, by Opial's lemma in [16], we conclude that the sequence {a_n} converges weakly to a point in Ψ. □
4. Application to osteoporosis risk prediction
Osteoporosis is a major global health issue, particularly affecting the elderly population. It leads to weakened bones and a higher risk of fractures, which can result in significant morbidity, loss of mobility, and increased mortality rates. Machine learning (ML) models can analyze large datasets containing complex variables like demographics, lifestyle, genetics, and bone density, identifying individuals at high risk of osteoporosis earlier. This allows for timely interventions, preventing fractures and other complications.
In this section, we implement our newly proposed algorithm as the optimizer for an extreme learning machine (ELM), originally introduced by Huang et al. [19], to assess osteoporosis risk using a comprehensive dataset from Kaggle*. This dataset provides a detailed overview of health factors that contribute to osteoporosis, including demographic details, lifestyle choices, medical history, and bone health indicators. Its comprehensive nature facilitates the development of machine learning models that can accurately identify individuals at high risk for osteoporosis. By analyzing key factors such as age, gender, hormonal changes, and lifestyle habits, our research significantly advances osteoporosis management and prevention strategies. This predictive capability enables early diagnosis, supporting timely interventions that minimize fracture risk, enhance patient outcomes, and optimize the allocation of healthcare resources. Additionally, the integration of machine learning models with our novel optimizer improves prediction accuracy, representing a significant innovation in the field. Table 1 provides readers with an understanding of the dataset's structure, including a detailed description of its components.
*https://www.kaggle.com/datasets/amitvkulkarni/lifestylefactors-influencing-osteoporosis
Osteoporosis occurs when bone density significantly decreases, resulting in fragile and brittle bones. It is diagnosed when the T-score is -2.5 or lower, based on a bone density scan (DEXA scan), see Figure 1. At this stage, bones have become considerably more porous and weaker, making them prone to fractures. Osteoporosis typically progresses from osteopenia, an intermediate stage characterized by lower-than-normal bone density but not as severe as osteoporosis. Factors such as age, hormonal changes (e.g., menopause), nutritional deficiencies (e.g., calcium or vitamin D), and lack of physical activity can contribute to the development of osteoporosis.
To construct our extreme learning machine (ELM), we consider N distinct samples, where the training set S := {(x_n, t_n) : x_n ∈ R^n, t_n ∈ R^m, n = 1, 2, …, N} consists of input data x_n and corresponding target outputs t_n. The output function of an ELM for a standard single-layer feedforward network (SLFN) with M hidden nodes is mathematically represented as:
where w_i is a randomly initialized weight and b_i a randomly initialized bias for the i-th hidden node. The goal is to find the optimal output weights β_i. The above system of linear equations can be represented in matrix form as T = Hβ, where
where H is the hidden-layer output matrix, T = [t_1^T, …, t_N^T]^T is the target output matrix, and β = [β_1^T, …, β_M^T]^T is the vector of optimal output weights. These optimal weights can be computed as β = H^†T, where H^† is the Moore-Penrose generalized inverse of H, though finding H^† may be challenging in practice. Therefore, obtaining a solution β via convex minimization can help address this challenge. The least squares problem is particularly effective for this, and regularization is a commonly employed technique in machine learning and statistics to mitigate overfitting, enhance model generalization, and ultimately improve performance in classification tasks. We conducted a series of experiments on a classification problem, explicitly using the well-known least absolute shrinkage and selection operator (LASSO) method [20]. The detailed descriptions of these experiments are provided below: For λ > 0,
min_β { (1/2)‖Hβ − T‖₂² + λ‖β‖₁ }.  (4.1)
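Before turning to the solver, the ELM construction just described can be sketched in a few lines of Python. The sigmoid activation, the pinv-based baseline, and all names are our illustrative choices; the weight and bias ranges are those reported later in this section.

```python
import numpy as np

def elm_train(samples, targets, M=500, seed=0):
    """Minimal ELM sketch: random hidden layer, least-squares output weights.
    samples: (N, d) input matrix; targets: (N, m) target matrix."""
    rng = np.random.default_rng(seed)
    d = samples.shape[1]
    W = rng.uniform(-50, 50, size=(d, M))          # random input weights
    b = rng.uniform(-5, 5, size=M)                 # random biases
    H = 1.0 / (1.0 + np.exp(-(samples @ W + b)))   # hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ targets             # beta = H^dagger T (Moore-Penrose)
    return W, b, beta
```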
By applying Algorithm 4 to solve problem (4.1), we set Xβ ≡ ∇((1/2)‖Hβ − T‖₂²) and Yβ ≡ ∂(λ‖β‖₁) with λ = 0.01.
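Under these standard identifications, the two operators have closed forms: the forward operator is the least-squares gradient, and the resolvent of Y is componentwise soft-thresholding. A minimal sketch (function names ours):

```python
import numpy as np

lam = 0.01  # regularization weight lambda from the text

def X_op(beta, H, T):
    # Forward operator: gradient of (1/2)||H beta - T||_2^2
    return H.T @ (H @ beta - T)

def JY(v, mu, lam=lam):
    # Resolvent (I + mu*Y)^{-1} of Y = subdifferential of lam*||.||_1:
    # componentwise soft-thresholding with threshold mu*lam
    return np.sign(v) * np.maximum(np.abs(v) - mu * lam, 0.0)
```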
We evaluate the performance of the classification algorithms using four evaluation metrics: accuracy, precision, recall, and F1-score [21]. These metrics are defined as follows:
Accuracy = (TP + TN)/(TP + TN + FP + FN), Precision = TP/(TP + FP), Recall = TP/(TP + FN), F1-score = 2 × Precision × Recall/(Precision + Recall).
In these formulas, TP represents True Positives, TN True Negatives, FP False Positives, and FN False Negatives.
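A direct transcription of these formulas into code (a small helper of our own, assuming the counts in each denominator are nonzero):

```python
def classification_metrics(tp, tn, fp, fn):
    # Direct implementation of the four evaluation metrics above
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    f1        = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```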
Additionally, we used the binary cross-entropy loss [22] to evaluate the model's ability to distinguish between two classes in binary classification tasks. This loss is computed as the average
Loss = −(1/N) ∑_{i=1}^{N} [φ_i log(φ̄_i) + (1 − φ_i) log(1 − φ̄_i)],
where φ̄_i represents the predicted probability for the i-th instance, φ_i is the corresponding true label, and N is the total number of instances.
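In code, the only subtlety is clipping the predicted probabilities away from 0 and 1 so that the logarithms stay finite; a short sketch of our own:

```python
import numpy as np

def binary_cross_entropy(phi, phi_bar, eps=1e-12):
    # Average binary cross-entropy; phi: true labels in {0,1},
    # phi_bar: predicted probabilities; eps guards against log(0)
    p = np.clip(phi_bar, eps, 1 - eps)
    return -np.mean(phi * np.log(p) + (1 - phi) * np.log(1 - p))
```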
Next, as illustrated in Figure 2, we partition the dataset using a 5-fold cross-validation approach. In each fold, 80% of the data is used for training (highlighted in purple), and 20% is allocated for validation (highlighted in green). This ensures that every subset of the data is used for validation exactly once, while the rest is used for training.
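As a sketch of this partitioning (using scikit-learn's KFold, which is our choice of tooling; the paper does not specify one):

```python
import numpy as np
from sklearn.model_selection import KFold

features = np.random.rand(100, 8)  # placeholder data; the real set comes from Kaggle
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(features), start=1):
    # 80% of rows train, 20% validate; each row validates exactly once
    print(f"Fold {fold}: {len(train_idx)} training rows, {len(val_idx)} validation rows")
```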
For the comparison of Algorithms 1–4, we set all parameters as follows: the number of hidden nodes is M = 500, with randomly initialized weights w_i in the range [−50, 50] and biases b_i in the range [−5, 5]. Specifically, we define θ_n = 0.0004 for Algorithm 2,
with
for Algorithm 3, and
with
for Algorithm 4, where N denotes the iteration number at which we stop the algorithm. For further details on the parameter settings, please refer to Table 2.
The results for each algorithm, evaluated on Training Box 1, are presented in Table 3.
Table 4 displays the performance outcomes for each algorithm, assessed using Training Box 2.
The performance outcomes for each algorithm, after being tested on Training Box 3, are detailed in Table 5.
The performance outcomes for each algorithm, after being tested on Training Box 4, are detailed in Table 6.
The performance results for each algorithm, evaluated on Training Box 5, are provided in Table 7.
Remark 4.1. 1) From Tables 3–7, the performance of Algorithms 1–4 was evaluated using a 5-fold cross-validation scheme, with each training box serving as the validation set in turn:
(i) Training Box 1: Algorithm 4 achieved the highest accuracy (82.05%), with notable precision (91.79%) and F1-score (83.64%).
(ii) Training Box 2: Algorithms 3 and 4 tied for the highest accuracy (85.93%), with Algorithm 4 showing marginally better CPU time and recall (81.61%).
(iii) Training Box 3: Algorithm 4 again performed best, achieving an accuracy of 84.40% and a high precision (90.81%).
(iv) Training Box 4: Algorithm 4 had the best accuracy (82.35%) and precision (93.36%).
(v) Training Box 5: Algorithm 4 tied for the highest accuracy (85.32%) and showed the best precision (94.89%) and a marginally better number of iterations.
In general, Algorithm 4 consistently demonstrated strong performance in accuracy, precision, and F1-score across most of the training boxes.
2) In practical applications of the proposed algorithm to machine learning problems, the challenge of unknown or difficult-to-estimate Lipschitz constants is mitigated by the finite nature of the feature set. In such cases, the Lipschitz constant can be approximated more quickly due to the boundedness of the feature space. Moreover, the efficiency and convergence of the proposed algorithm are not significantly impacted by the limitations associated with estimating the Lipschitz constant. This is demonstrated through the results presented in Tables 3–7, where the algorithm achieves effective performance and convergence despite potential uncertainties in the Lipschitz parameter.
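For problem (4.1) in particular, the relevant Lipschitz constant is ℓ = λ_max(HᵀH), the squared spectral norm of H, so it can be approximated cheaply by power iteration; a short sketch of our own:

```python
import numpy as np

def lipschitz_estimate(H, iters=100, seed=0):
    """Estimate the Lipschitz constant of beta -> H^T(H beta - T), which equals
    the largest eigenvalue of H^T H, by power iteration on H^T H."""
    v = np.random.default_rng(seed).standard_normal(H.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = H.T @ (H @ v)
        v = w / np.linalg.norm(w)
    return v @ (H.T @ (H @ v))  # Rayleigh quotient ~ lambda_max(H^T H)
```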
From Figures 3–7, the accuracy and loss plots for Algorithm 4 across all training boxes show consistent trends, with training and validation accuracy remaining close to each other throughout the iterations, indicating minimal overfitting. The training and validation loss curves exhibit similar behavior, steadily decreasing over time and stabilizing. However, in some figures (such as Figures 3 and 6), a slight divergence between the training and validation loss can be observed in later iterations, suggesting mild overfitting. Overall, the model maintains a good balance between training and validation performance, demonstrating that the regularization and parameter choices prevent significant overfitting. These trends suggest that Algorithm 4 generalizes well to the validation data without overfitting the training set.
From Figure 8, we see that the ROC curves illustrate the model's classification performance, where higher AUC values reflect stronger class separation.
5. Conclusions
We proposed a novel algorithm (Algorithm 4) to solve variational inclusion problems and fixed point problems involving quasi-nonexpansive mappings in a real Hilbert space. Our main theorem establishes the weak convergence of the proposed algorithm under certain conditions. We also applied the algorithm in conjunction with an extreme learning machine to the problem of data classification, specifically to predict osteoporosis risk. Our algorithm achieves an accuracy of over 82%, a precision of over 91%, a recall of over 76%, and an F1-score of over 83% across all training boxes, which demonstrates the effectiveness of the algorithm we developed.
Data availability
The data are available on the Kaggle website (https://www.kaggle.com/datasets/amitvkulkarni/lifestylefactors-influencing-osteoporosis).
Institutional Review Board statement
This study was conducted in accordance with the Declaration of Helsinki, the Belmont Report, the CIOMS guidelines, the International Conference on Harmonization Good Clinical Practice (ICH-GCP), and 45 CFR 46.101(b), and with approval from the Ethics Committee and Institutional Review Board of the University of Phayao (IRB approval number: HREC-UP-HSST 1.102867).
Author contributions
Raweerote Suparatulatorn: Writing – original draft, formal analysis; Wongthawat Liawrungrueang: Writing – review & editing, data curation; Thanasak Mouktonglang: Writing – review & editing, project administration; Watcharaporn Cholamjiak: Writing – review & editing, software. All authors have read and agreed to the published version of the manuscript.
Use of Generative-AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This research work was partially supported by the CMU Proactive Researcher, Chiang Mai University [grant number 780/2567]. W. Liawrungrueang and W. Cholamjiak would like to thank the Thailand Science Research and Innovation Fund (Fundamental Fund 2025, Grant No. 5025/2567) and the University of Phayao.
Conflict of interest
The authors declare no conflicts of interest.