Biomarkers play an important role in the prediction and diagnosis of cancers, so effective methods for extracting them are urgently needed. The pathway information corresponding to microarray gene expression data can be obtained from public databases, which makes it possible to identify biomarkers based on pathway information and has attracted extensive attention. Most existing methods regard all member genes of a pathway as equally important for inferring pathway activity. However, each gene should contribute differently to the inference of pathway activity. In this research, an improved multi-objective particle swarm optimization algorithm with a penalty boundary intersection decomposition mechanism (IMOPSO-PBI) is proposed to quantify the relevance of each gene in pathway activity inference. The proposed algorithm introduces two optimization objectives, the t-score and the z-score. In addition, to address the poor diversity of the optimal set produced by most multi-objective optimization algorithms, an adaptive mechanism for adjusting the penalty parameters based on PBI decomposition is introduced. To verify its effectiveness, IMOPSO-PBI was compared with several existing methods on six gene expression datasets. The comparative results show that IMOPSO-PBI achieves higher classification accuracy and that the extracted feature genes are verified to possess biological significance.
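For readers unfamiliar with PBI decomposition, the scalarizing function it relies on can be sketched as follows. This is a minimal illustration of the standard PBI aggregation for minimization as used in decomposition-based algorithms such as MOEA/D, not the IMOPSO-PBI implementation itself. The function name `pbi`, its argument names, and the fixed default `theta=5.0` are assumptions made for illustration; the adaptive rule that the abstract describes for adjusting the penalty parameter is not reproduced here.

```python
import math


def pbi(f, weight, ideal, theta=5.0):
    """Standard PBI scalarization (minimization); names are illustrative.

    f      : objective vector of one candidate solution
    weight : weight (reference direction) vector of the subproblem
    ideal  : ideal point z*, the best value seen for each objective
    theta  : penalty parameter (fixed here; an assumption, not the paper's rule)
    Returns (g, d1, d2) where g = d1 + theta * d2.
    """
    diff = [fi - zi for fi, zi in zip(f, ideal)]
    norm_w = math.sqrt(sum(w * w for w in weight))
    # d1: length of the projection of (f - z*) onto the weight direction,
    # which measures convergence toward the ideal point.
    d1 = abs(sum(d * w for d, w in zip(diff, weight))) / norm_w
    # d2: perpendicular distance from f to the line through z* along the
    # weight direction, which measures deviation from that direction.
    d2 = math.sqrt(
        sum((d - d1 * w / norm_w) ** 2 for d, w in zip(diff, weight))
    )
    return d1 + theta * d2, d1, d2


if __name__ == "__main__":
    g, d1, d2 = pbi(f=(3.0, 4.0), weight=(1.0, 0.0), ideal=(0.0, 0.0))
    print(g, d1, d2)
```

A larger `theta` penalizes solutions that stray from their weight direction, which favors an evenly spread set, while a smaller `theta` favors convergence. This trade-off is why the choice of penalty parameter affects the diversity of the resulting solution set.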
Citation: Shuaiqun Wang, Tianshun Zhang, Wei Kong, Gen Wen, Yaling Yu. An improved MOPSO approach with adaptive strategy for identifying biomarkers from gene expression dataset[J]. Mathematical Biosciences and Engineering, 2023, 20(2): 1580-1598. doi: 10.3934/mbe.2023072