Research article Special Issues

An adaptive feature selection algorithm based on MDS with uncorrelated constraints for tumor gene data classification


  • Received: 15 October 2022 Revised: 25 December 2022 Accepted: 08 January 2023 Published: 03 February 2023
  • The development of DNA microarray technology has made it possible to study cancer at the gene level. Because they ignore the correlation between genes, current unsupervised feature selection models may over-focus on genes with similar attributes and select many redundant genes, which can degrade the clustering performance of the model. To tackle this problem, we propose an adaptive feature selection model in which a reconstructed coefficient matrix with an additional constraint is introduced to transform the original data from a high-dimensional space into a low-dimensional space while preventing over-focusing on genes with similar attributes. Moreover, Alternative Optimization (AO) is proposed to handle the nonconvex optimization problem that arises in solving the proposed model. Experimental results on four different cancer datasets show that the proposed model is superior to existing models in terms of clustering accuracy and sparsity of the selected genes.

    Citation: Wenkui Zheng, Guangyao Zhang, Chunling Fu, Bo Jin. An adaptive feature selection algorithm based on MDS with uncorrelated constraints for tumor gene data classification[J]. Mathematical Biosciences and Engineering, 2023, 20(4): 6652-6665. doi: 10.3934/mbe.2023286
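    As a rough illustration of the alternating-optimization idea mentioned in the abstract, the sketch below alternates between two closed-form updates on a toy nonconvex problem. It is not the model proposed in the article: the objective, the orthogonality constraint on the embedding, the ridge penalty, and all names (`alternating_feature_scores`, `lam`, `k`) are assumptions chosen only to show how a coefficient matrix can be fitted by alternating updates and then used to rank features.

```python
import numpy as np

def alternating_feature_scores(X, k=2, lam=1.0, n_iter=30, seed=0):
    """Toy alternating optimization (illustrative assumption, not the paper's model):

        min_{W, Y} ||X W - Y||_F^2 + lam * ||W||_F^2   s.t.  Y^T Y = I

    X is (n_samples, n_features), W is the (n_features, k) coefficient matrix,
    and Y is an (n_samples, k) low-dimensional embedding with orthonormal columns.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    X = X - X.mean(axis=0)                      # center each feature
    # Random orthonormal starting embedding (requires n >= k).
    Y, _ = np.linalg.qr(rng.standard_normal((n, k)))
    gram = X.T @ X + lam * np.eye(d)            # invertible for lam > 0
    history = []
    for _ in range(n_iter):
        # W-step: with Y fixed, the problem is ridge regression (closed form).
        W = np.linalg.solve(gram, X.T @ Y)
        # Y-step: with W fixed, the closest orthonormal matrix to X W
        # is U V^T from the thin SVD of X W.
        U, _, Vt = np.linalg.svd(X @ W, full_matrices=False)
        Y = U @ Vt
        obj = np.linalg.norm(X @ W - Y) ** 2 + lam * np.linalg.norm(W) ** 2
        history.append(obj)
    # Score each feature by the Euclidean norm of its row in W.
    scores = np.linalg.norm(W, axis=1)
    return scores, history

# Hypothetical usage on random data shaped like a small expression matrix
# (20 samples, 50 genes); real data would replace X_demo.
X_demo = np.random.default_rng(1).standard_normal((20, 50))
scores, history = alternating_feature_scores(X_demo)
top_genes = np.argsort(scores)[::-1][:10]       # indices of the 10 highest-scoring features
```

    Each update solves its subproblem exactly, so the recorded objective does not increase from one iteration to the next, although the procedure may stop at a local rather than a global minimum. This toy objective contains no term that penalizes correlated features; the uncorrelated constraint described in the abstract would have to be added to the model separately.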

  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)