The Hestenes-Stiefel (HS) conjugate gradient method is highly effective for solving large-scale smooth optimization problems, owing to its low storage requirements and high computational efficiency. The method has also been employed in practical applications such as image restoration and machine learning. In this paper, the authors propose an improved Hestenes-Stiefel conjugate gradient algorithm with the following characteristics: ⅰ) the search direction possesses the sufficient descent and trust region properties independently of any line search condition; ⅱ) the algorithm achieves global convergence; ⅲ) the algorithm can be applied to image restoration problems, monotone nonlinear equations, and machine learning problems. Numerical results show that the proposed method is competitive.
Citation: Gonglin Yuan, Minjie Huang. An efficient modified HS conjugate gradient algorithm in machine learning[J]. Electronic Research Archive, 2024, 32(11): 6175-6199. doi: 10.3934/era.2024287
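For context, the classical HS scheme that the paper builds on computes the search direction as $d_{k+1} = -g_{k+1} + \beta_k^{HS} d_k$ with $\beta_k^{HS} = g_{k+1}^\top y_k / (d_k^\top y_k)$ and $y_k = g_{k+1} - g_k$. The Python sketch below implements this textbook formula with a simple Armijo backtracking line search; it illustrates only the baseline method, since the paper's modified $\beta_k$ and its projection step for monotone equations are not reproduced in this abstract.

```python
import numpy as np

def hs_conjugate_gradient(f, grad, x0, tol=1e-6, max_iter=5000):
    """Nonlinear CG with the classical Hestenes-Stiefel beta (textbook version)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g  # start with the steepest descent direction
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        if g.dot(d) >= 0:
            d = -g  # safeguard: restart when d is not a descent direction
        # Armijo backtracking line search along d.
        alpha, c, rho = 1.0, 1e-4, 0.5
        while f(x + alpha * d) > f(x) + c * alpha * g.dot(d) and alpha > 1e-12:
            alpha *= rho
        x_new = x + alpha * d
        g_new = grad(x_new)
        y = g_new - g  # gradient difference y_k = g_{k+1} - g_k
        denom = d.dot(y)
        # Classical HS parameter: beta_k = g_{k+1}^T y_k / (d_k^T y_k).
        beta = g_new.dot(y) / denom if abs(denom) > 1e-12 else 0.0
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x

# Example: minimize the two-dimensional Rosenbrock function.
f = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
grad = lambda x: np.array([
    -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
    200 * (x[1] - x[0] ** 2),
])
print(hs_conjugate_gradient(f, grad, [-1.2, 1.0]))
```

The steepest-descent restart is a common practical safeguard for the classical formula; one advantage claimed for the paper's modification is that its direction satisfies the sufficient descent property by construction, making such a restart unnecessary.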