Rayleigh-Ritz discriminant analysis (RRDA) is an effective algorithm for linear discriminant analysis (LDA), but there are some drawbacks in its implementation. In this paper, we first improved Rayleigh-Ritz discriminant analysis (IRRDA) to make its framework more concise, and established the equivalence theory of the solution space between our discriminant analysis and RRDA. Second, we proposed a new model based on positive definite systems of linear equations for linear discriminant analysis, and certificated the rationality of the new model. Compared with the traditional linear regression model for linear discriminant analysis, the coefficient matrix of our model avoided forming a centralized matrix or appending the original data matrix, but the original matrix itself, which greatly reduced the computational complexity. According to the size of data matrix, we designed two solution schemes for the new model based on the block conjugate gradient method. Experiments in real-world datasets demonstrated the effectiveness and efficiency of our algorithm and it showed that our method was more efficient and faster than RRDA.
Citation: Wenya Shi, Zhixiang Chen. A breakdown-free block conjugate gradient method for large-scale discriminant analysis[J]. AIMS Mathematics, 2024, 9(7): 18777-18795. doi: 10.3934/math.2024914
Rayleigh-Ritz discriminant analysis (RRDA) is an effective algorithm for linear discriminant analysis (LDA), but there are some drawbacks in its implementation. In this paper, we first improved Rayleigh-Ritz discriminant analysis (IRRDA) to make its framework more concise, and established the equivalence theory of the solution space between our discriminant analysis and RRDA. Second, we proposed a new model based on positive definite systems of linear equations for linear discriminant analysis, and certificated the rationality of the new model. Compared with the traditional linear regression model for linear discriminant analysis, the coefficient matrix of our model avoided forming a centralized matrix or appending the original data matrix, but the original matrix itself, which greatly reduced the computational complexity. According to the size of data matrix, we designed two solution schemes for the new model based on the block conjugate gradient method. Experiments in real-world datasets demonstrated the effectiveness and efficiency of our algorithm and it showed that our method was more efficient and faster than RRDA.
[1] | L. C. Hu, W. S. Zhang, Orthogonal neighborhood preserving discriminant analysis with patch embedding for face recognition, Pattern Recogn., 106 (2020), 107450. http://doi.org/10.1016/j.patcog.2020.107450 doi: 10.1016/j.patcog.2020.107450 |
[2] | A. Sasithradevi, S. M. M. Roomi, Video classification and retrieval through spatio-temporal Radon features, Pattern Recogn., 99 (2020), 107099. https://doi.org/10.1016/j.patcog.2019.107099 doi: 10.1016/j.patcog.2019.107099 |
[3] | W. Y. Shi, Y. W. Lou, G. Wu, On general matrix exponential discriminant analysis methods for high dimensionality reduction, Calcolo, 57 (2020), 18. http://doi.org/10.1007/s10092-020-00366-6 doi: 10.1007/s10092-020-00366-6 |
[4] | K. Fukunaga, Introduction to statistical pattern classification, USA: Academic Press, 1990. |
[5] | T. Hastie, R. Tibshirani, J. Friedman, The elements of statistical learning: Data mining, inference, and prediction, New York: Springer, 2000. |
[6] | J. W. Chen, S. Y. Xie, H. Jiang, H. Y. Yang, F. P. Nie, A novel $k$-Means framework via constrained relaxation and spectral rotation, IEEE T. Neur. Net. Lear., 2023, 1–14. http://doi.org/10.1109/TNNLS.2023.3282938 |
[7] | Z. X. Li, F. P. Nie, R. Wang, X. L. Li, A revised formation of trace ratio LDA for small sample size problem, IEEE T. Neur. Net. Lear., 2024, 1–7. http://doi.org/10.1109/TNNLS.2024.3362512 |
[8] | J. P. Ye, Least squares linear discriminant analysis, Proceedings of the 24th international conference on machine learning, 2007, 1087–1093. http://doi.org/10.1145/1273496.1273633 |
[9] | R. S. S. Kramer, A. W. Young, A. M. Burton, Understanding face familiarity, Cognition, 172 (2018), 46–58. http://doi.org/10.1016/j.cognition.2017.12.005 doi: 10.1016/j.cognition.2017.12.005 |
[10] | Y. D. Lu, G. Wu, Fast and incremental algorithms for exponential semi-supervised discriminant embedding, Pattern Recogn., 108 (2020), 107530. http://doi.org/10.1016/j.patcog.2020.107530 doi: 10.1016/j.patcog.2020.107530 |
[11] | M. Mohri, A. Rostamizadeh, A. Talwalkar, Foundations of machine learning, Cambridge: The MIT Press, 2018. |
[12] | G. Wu, T. T. Feng, L. J. Zhang, M. Yang, Inexact implementation using Krylov subspace methods for large scale exponential discriminant analysis with applications to high dimensionality reduction problems, Pattern Recogn., 66 (2017), 328–341. http://doi.org/10.1016/J.PATCOG.2016.08.020 doi: 10.1016/J.PATCOG.2016.08.020 |
[13] | C. X. Ren, D. Q. Dai, X. F. He, H. Yan, Sample weighting: An inherent approach for outlier suppressing discriminant analysis, IEEE T. Knowl. Data En., 27 (2015), 3070–3083. http://doi.org/10.1109/TKDE.2015.2448547 doi: 10.1109/TKDE.2015.2448547 |
[14] | Y. F. Yu, C. X. Ren, M. Jiang, M. Y. Sun, D. Q. Dai, G. D. Guo, Sparse approximation to discriminant projection learning and application to image classification, Pattern Recogn., 96 (2019), 106963. http://doi.org/10.1016/J.PATCOG.2019.106963 doi: 10.1016/J.PATCOG.2019.106963 |
[15] | L. Wu, C. H. Shen, A. V. D. Hengel, Deep linear discriminant analysis on sher networks: A hybrid architecture for person re-identication, Pattern Recogn., 65 (2017), 238–250. https://doi.org/10.1016/j.patcog.2016.12.022 doi: 10.1016/j.patcog.2016.12.022 |
[16] | C. Moulin, C. Largeron, C. Ducottet, M. Gery, C. Barat, Fisher linear discriminant analysis for text-image combination in multimedia information retrieval, Pattern Recogn., 47 (2014), 260–269. http://doi.org/10.1016/J.PATCOG.2013.06.003 doi: 10.1016/J.PATCOG.2013.06.003 |
[17] | H. S. Ye, Y. J. Li, C. Chen, Z. H. Zhang, Fast fisher discriminant analysis with randomized algorithms, Pattern Recogn., 72 (2017), 82–92. http://dx.doi.org/10.1016/J.PATCOG.2017.06.029 doi: 10.1016/J.PATCOG.2017.06.029 |
[18] | L. Zhu, D. S. Huang, A Rayleigh-Ritz style method for large-scale discriminant analysis, Pattern Recogn., 47 (2014), 1698–1708. http://doi.org/10.1016/j.patcog.2013.10.007 doi: 10.1016/j.patcog.2013.10.007 |
[19] | J. H. Friedman, Regularized discriminant analysis, J. Am. Stat. Assoc., 84 (1989), 165–175. https://doi.org/10.1080/01621459.1989.10478752 doi: 10.1080/01621459.1989.10478752 |
[20] | X. W. Zhang, L. Chen, D. L. Chu, L. Z. Liao, M. K. Ng, R. C. E. Tan, Incremental regularized least squares for dimensionality reduction of large-scale data, SIAM J. Sci. Comput., 38 (2016), B414–B439. http://doi.org/10.1137/15M1035653 doi: 10.1137/15M1035653 |
[21] | A. Beck, R. Sharon, A branch and bound method solving the max-min linear discriminant analysis problem, Optim. Method. Softw., 38 (2023), 1031–1057. https://doi.org/10.1080/10556788.2023.2198769 doi: 10.1080/10556788.2023.2198769 |
[22] | J. H. Zhao, H. Y. Liang, S. L. Li, Z. J. Yang, Z. Wang, Matrix-based vs. vector-based linear discriminant analysis: A comparison of regularized variants on multivariate time series data, Inform. Sciences, 654 (2024), 119872. https://doi.org/10.1016/j.ins.2023.119872 doi: 10.1016/j.ins.2023.119872 |
[23] | D. Cai, X. F. He, J. W. Han, SRDA: An efficient algorithm for large-scale discriminant analysis, IEEE T. Knowl. Data En., 20 (2008), 1–12. http://dx.doi.org/10.1109/TKDE.2007.190669 doi: 10.1109/TKDE.2007.190669 |
[24] | J. P. Ye, Q. Li, LDA/QR: An efficient and effective dimension reduction algorithm and its theoretical foundation, Pattern Recogn., 37 (2004), 851–854. http://doi.org/10.1016/J.PATCOG.2003.08.006 doi: 10.1016/J.PATCOG.2003.08.006 |
[25] | P. Howland, H. Park, Generalizing discriminant analysis using the generalized singular value decomposition, IEEE T. Pattern Anal., 26 (2004), 995–1006. http://doi.org/10.1109/TPAMI.2004.46 doi: 10.1109/TPAMI.2004.46 |
[26] | E. I. G. Nassara, E. Maes, M. Kharouf, Linear discriminant analysis for large-scale data: Application on text and image data, IEEE, 2016,961–964. http://doi.org/10.1109/ICMLA.2016.0173 |
[27] | Z. H. Zhang, G. Dai, C. F. Xu, M. I. Jordan, Regularized discriminant analysis, ridge regression and beyond, J. Mach. Learn. Res., 11 (2010), 2199–2228. http://doi.org/10.5555/1756006.1859927 doi: 10.5555/1756006.1859927 |
[28] | W. Y. Shi, G. Wu, New algorithms for trace-ratio problem with application to high-dimension and large-sample data dimensionality reduction, Mach. Learn., 2021. https://doi.org/10.1007/s10994-020-05937-w |
[29] | C. H. Park, H. Park, A relationship between linear discriminant analysis and the generalized minimum squared error solution, SIAM J. Matrix Anal. A., 27 (2005), 474–492. http://doi.org/10.1137/040607599 doi: 10.1137/040607599 |
[30] | X. H. Chen, T. Chen, Low-rank approximation-based bidirectional linear discriminant analysis for image data, Multimed. Tools Appl., 83 (2024), 19369–19389. https://doi.org/10.1007/s11042-023-16239-3 doi: 10.1007/s11042-023-16239-3 |
[31] | L. Sun, B. Ceran, J. P. Ye, A scalable two-stage approach for a class of dimensionality reduction techniques, Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, 2010,313–322. http://doi.org/10.1145/1835804.1835846 |
[32] | H. Ji, Y. H. Li, A breakdown-free block conjugate gradient method, BIT Numer. Math., 57 (2017), 379–403. http://doi.org/10.1007/s10543-016-0631-z doi: 10.1007/s10543-016-0631-z |
[33] | L. Wolf, T. Hassner, I. Maoz, Face recognition in unconstrained videos with matched background similarity, IEEE, 2011,529–534. http://doi.org/10.1109/CVPR.2011.5995566 |
[34] | T. Cover, P. Hart, Nearest neighbor pattern classification, IEEE T. Inform. Theory, 13 (1967), 21–27. http://doi.org/10.1109/TIT.1967.1053964 doi: 10.1109/TIT.1967.1053964 |