Research article

Multi-label feature selection based on HSIC and sparrow search algorithm


  • Received: 27 April 2023 Revised: 10 June 2023 Accepted: 20 June 2023 Published: 26 June 2023
  • Feature selection has long been an important topic in machine learning and data mining. In multi-label learning tasks, each sample in a dataset is associated with multiple labels, and the labels are usually correlated with one another. At the same time, multi-label learning suffers from the "curse of dimensionality", which makes feature selection a difficult task. To address this problem, this paper proposes a multi-label feature selection method based on the Hilbert-Schmidt independence criterion (HSIC) and the sparrow search algorithm (SSA). SSA searches the feature space, while HSIC serves as the feature selection criterion, measuring the dependence between candidate features and the full label set, so that an optimal feature subset can be selected. Experimental results demonstrate the effectiveness of the proposed method.

    Citation: Tinghua Wang, Huiying Zhou, Hanming Liu. Multi-label feature selection based on HSIC and sparrow search algorithm[J]. Mathematical Biosciences and Engineering, 2023, 20(8): 14201-14221. doi: 10.3934/mbe.2023635
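
    To make the search-plus-criterion pipeline concrete, the sketch below shows how such a fitness function can be computed with the standard biased empirical HSIC estimator, tr(KHLH)/(n-1)^2, where a wrapper search such as SSA would score each candidate binary feature mask. This is a minimal illustration in Python, not the authors' released code: the Gaussian kernel on features, the linear kernel on the label matrix, and the helper names are assumptions made here for clarity; the paper's exact kernel choices and SSA variant may differ.

        # Minimal sketch (not the authors' code): HSIC-based fitness for a
        # binary feature mask, usable inside a wrapper search such as SSA.
        import numpy as np

        def rbf_kernel(X, sigma=1.0):
            """Gaussian (RBF) kernel matrix over the rows of X."""
            sq = np.sum(X ** 2, axis=1)
            dist2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
            return np.exp(-dist2 / (2.0 * sigma ** 2))

        def hsic(K, L):
            """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2,
            where H = I - (1/n) * ones(n, n) is the centering matrix."""
            n = K.shape[0]
            H = np.eye(n) - np.ones((n, n)) / n
            return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

        def fitness(mask, X, Y, sigma=1.0):
            """Dependence between a candidate feature subset and all labels.
            mask : boolean vector of length d (one bit per feature)
            X    : n x d feature matrix;  Y : n x q binary label matrix
            """
            if not mask.any():                       # reject the empty subset
                return -np.inf
            K = rbf_kernel(X[:, mask], sigma=sigma)  # kernel on selected features
            L = Y @ Y.T                              # linear label kernel (an assumption)
            return hsic(K, L)

        # Toy usage: score a random mask on synthetic multi-label data.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 20))
        Y = (rng.random((100, 5)) < 0.3).astype(float)
        mask = rng.random(20) < 0.5
        print(fitness(mask, X, Y))

    In a wrapper search of this kind, each sparrow's continuous position would be binarized into such a mask (a sigmoid transfer function is one common choice in binary metaheuristics), scored with this fitness, and the highest-HSIC subset found by the swarm would be returned as the selected features.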

  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)