Research article Special Issues

Efficient multi-omics clustering with bipartite graph subspace learning for cancer subtype prediction

  • Received: 30 August 2024 Revised: 24 October 2024 Accepted: 01 November 2024 Published: 08 November 2024
  • Because cancer is complex and highly heterogeneous, and because pathogenesis and clinical features differ among cancer subtypes, identifying cancer subtypes is crucial for cancer diagnosis, prognosis, and treatment. The rapid development of high-throughput technologies has dramatically improved the efficiency of collecting data from various types of omics. Integrating multi-omics data related to cancer occurrence and progression can also lead to a better understanding of cancer pathogenesis, subtype prediction, and personalized treatment options. We therefore proposed an efficient multi-omics bipartite graph subspace learning anchor-based clustering (MBSLC) method to identify cancer subtypes, in which the bipartite graph is intended to learn cluster-friendly representations. Experiments showed that MBSLC captures the latent spaces of multi-omics data effectively and outperforms other state-of-the-art methods for cancer subtype analysis. Survival and clinical analyses further demonstrated the effectiveness of MBSLC. The code and datasets of this paper can be found in
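
    To make the idea of anchor-based bipartite graph clustering concrete, the following is a minimal sketch of a generic variant, not the MBSLC method itself. It assumes each omics type is a samples-by-features NumPy array over the same samples, picks shared anchor samples at random, links every sample to its nearest anchors in each view, averages the resulting bipartite graphs, and clusters the spectral embedding of the fused graph. The function names, the random anchor selection, the kernel bandwidth, and the plain averaging of views are all illustrative assumptions; the subspace learning step named in the abstract is not modeled here.

```python
import numpy as np

def anchor_graph(X, anchors, k=3):
    """Row-stochastic n x m bipartite graph linking samples to their k nearest anchors."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)  # squared distances
    sigma2 = d2.mean() + 1e-12           # crude kernel bandwidth (assumption)
    Z = np.exp(-d2 / sigma2)
    drop = np.argsort(-Z, axis=1)[:, k:]  # everything except the k largest per row
    np.put_along_axis(Z, drop, 0.0, axis=1)
    return Z / Z.sum(axis=1, keepdims=True)

def kmeans(E, k, iters=50):
    """Plain Lloyd k-means with deterministic farthest-point initialisation."""
    centers = [E[0]]
    for _ in range(1, k):
        d = np.min([((E - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels

def bipartite_multiview_clustering(views, n_clusters, n_anchors=10, k=3, seed=0):
    """Cluster samples from several views via a fused sample-anchor bipartite graph."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    # Shared anchor samples, so graph columns line up across views (assumption).
    idx = rng.choice(n, size=n_anchors, replace=False)
    Z = np.mean([anchor_graph(X, X[idx], k) for X in views], axis=0)
    # Normalise columns by the square root of anchor degree: Z D^{-1/2}.
    Z_hat = Z / np.sqrt(Z.sum(axis=0, keepdims=True) + 1e-12)
    # Leading left singular vectors give the spectral embedding of the samples.
    U, _, _ = np.linalg.svd(Z_hat, full_matrices=False)
    E = U[:, :n_clusters]
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)
    return kmeans(E, n_clusters)
```

    Because the graph is n x m with m anchors much smaller than n samples, the singular value decomposition works on a thin matrix rather than a full n x n similarity matrix, which is the usual efficiency argument for anchor-based approaches. Calling `bipartite_multiview_clustering([X1, X2], n_clusters=2)` with two feature matrices over the same samples returns one integer label per sample.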

    Citation: Shuwei Zhu, Hao Liu, Meiji Cui. Efficient multi-omics clustering with bipartite graph subspace learning for cancer subtype prediction[J]. Electronic Research Archive, 2024, 32(11): 6008-6031. doi: 10.3934/era.2024279

  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (