Multi-label feature selection, an essential means of dimensionality reduction in multi-label learning, has become a research hotspot in machine learning. Because a linear relationship between the sample space and the label space rarely holds in practice, many scholars introduce a pseudo-label space. However, a pseudo-label space increases the number of model variables and may lose sample or label information. To address this problem, a multi-label feature selection method based on constraint mapping space regularization is proposed. The model first maps the sample space to the label space through a linear mapping. Second, since samples cannot be mapped perfectly onto the label space, the mapping space should be as close as possible to the label space while still preserving the basic manifold structure of the sample space; the Hilbert-Schmidt independence criterion is therefore combined with sample-manifold regularization to constrain these basic properties of the mapping space. Finally, the proposed algorithm is compared with MRDM, SSFS, and other algorithms on several classical multi-label data sets; the results show that the proposed algorithm is effective on multiple evaluation metrics.
Citation: Bangna Li, Qingqing Zhang, Xingshi He. Multi-label feature selection via constraint mapping space regularization[J]. Electronic Research Archive, 2024, 32(4): 2598-2620. doi: 10.3934/era.2024118
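To make the abstract's pipeline concrete, the following is a minimal, hypothetical sketch of the kind of objective it describes: a least-squares mapping term ||XW - Y||_F^2, a linear-kernel Hilbert-Schmidt independence criterion (HSIC) term encouraging dependence between the mapping space XW and the labels, a kNN heat-kernel graph Laplacian standing in for the sample manifold, and an L2,1 penalty yielding row-sparse feature weights. The helper names (knn_laplacian, select_features), the hyperparameters alpha, beta, and gamma, and the plain gradient-descent solver are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized kNN heat-kernel graph over samples."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                     # exclude self-neighbors
    sigma2 = np.median(d2[np.isfinite(d2)])          # heat-kernel bandwidth
    S = np.zeros((n, n))
    idx = np.argsort(d2, axis=1)[:, :k]              # k nearest neighbors per sample
    rows = np.repeat(np.arange(n), k)
    S[rows, idx.ravel()] = np.exp(-d2[rows, idx.ravel()] / sigma2)
    S = np.maximum(S, S.T)                           # symmetrize the affinity
    return np.diag(S.sum(axis=1)) - S                # L = D - S

def select_features(X, Y, alpha=0.1, beta=0.1, gamma=0.1, iters=100, lr=1e-3):
    """Rank features by row norms of W minimizing
       ||XW - Y||_F^2 + alpha*tr(W'X'LXW) - beta*HSIC(XW, Y) + gamma*||W||_{2,1}."""
    n, d = X.shape
    q = Y.shape[1]
    L = knn_laplacian(X)
    H = np.eye(n) - np.ones((n, n)) / n              # centering matrix for HSIC
    M = H @ (Y @ Y.T) @ H / (n - 1) ** 2             # linear-kernel HSIC factor
    W = np.zeros((d, q))
    for _ in range(iters):
        row = np.maximum(np.linalg.norm(W, axis=1, keepdims=True), 1e-8)
        grad = (2 * X.T @ (X @ W - Y)                # fit the label space
                + 2 * alpha * X.T @ L @ X @ W        # preserve sample manifold
                - 2 * beta * X.T @ M @ X @ W         # maximize HSIC with labels
                + gamma * W / row)                   # smoothed L2,1 subgradient
        W -= lr * grad
    return np.argsort(-np.linalg.norm(W, axis=1))    # features, most relevant first

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))
    Y = (X[:, :5] + 0.1 * rng.standard_normal((100, 5)) > 0).astype(float)
    print(select_features(X, Y)[:5])                 # informative features 0..4 should rank high
```

Ranking features by the row-wise L2 norms of W is the standard readout for L2,1-regularized feature selection; the paper's actual constraint formulation and optimization scheme may differ from this gradient-descent stand-in.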