Research article Special Issues

A robust framework for enhancing cardiovascular disease risk prediction using an optimized category boosting model


  • Received: 13 October 2023 Revised: 21 January 2024 Accepted: 23 January 2024 Published: 29 January 2024
  • Cardiovascular disease (CVD) is a leading cause of mortality worldwide, so accurate assessment of CVD risk is essential for prevention and intervention. In recent years, machine learning has advanced CVD risk prediction considerably. We propose a framework, CVD-OCSCatBoost, for predicting CVD risk and assessing the contribution of individual risk factors. The framework uses Lasso regression for feature selection and an optimized category boosting (CatBoost) model for classification. To optimize the model, we propose the opposition-based learning cuckoo search (OCS) algorithm. Integrating OCS with CatBoost yields OCSCatBoost, a classifier intended to improve both the accuracy and the efficiency of CVD prediction. We validate OCSCatBoost through extensive comparisons with the particle swarm optimization (PSO) algorithm, the seagull optimization algorithm (SOA), the cuckoo search (CS) algorithm, K-nearest-neighbor classification, decision tree, logistic regression, grid-search support vector machine (SVM), grid-search XGBoost, default CatBoost, and grid-search CatBoost. In these experiments, OCSCatBoost outperforms the other models, with an overall accuracy of 73.67%, a recall of 72.17%, and an AUC of 0.8024. These results highlight the potential of CVD-OCSCatBoost for improving cardiovascular disease risk prediction.
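
The abstract names opposition-based learning and cuckoo search but does not spell out how they are combined. The following is a minimal sketch of one common combination, not the authors' implementation: opposite points are used at initialization and when abandoned nests are replaced, and new solutions are proposed with Lévy flights drawn by Mantegna's algorithm. All function names, the parameter defaults (`n_nests`, `n_iter`, `pa`, the `0.01` step scale), and the toy objective are assumptions made for illustration. In the paper's setting the objective would presumably be a validation error of CatBoost over its hyperparameters; a shifted sphere function stands in for it here.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against a degenerate draw before dividing
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def clip(x, bounds):
    """Project a point back into the box defined by bounds."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]


def opposite(x, bounds):
    """Opposition-based learning: reflect each coordinate within its bounds."""
    return [lo + hi - xi for xi, (lo, hi) in zip(x, bounds)]


def ocs_minimize(objective, bounds, n_nests=15, n_iter=200, pa=0.25, seed=0):
    """Illustrative opposition-based cuckoo search (hypothetical variant)."""
    rng = random.Random(seed)

    def random_point():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    # Opposition-based initialization: sample nests, add their opposites,
    # and keep the n_nests best points of the combined pool.
    nests = [random_point() for _ in range(n_nests)]
    pool = nests + [opposite(x, bounds) for x in nests]
    pool.sort(key=objective)
    nests = pool[:n_nests]
    fitness = [objective(x) for x in nests]

    for _ in range(n_iter):
        best = list(nests[fitness.index(min(fitness))])
        # Levy-flight phase: propose a move scaled by the distance to the
        # current best nest and accept it only if it improves the nest.
        for i in range(n_nests):
            trial = clip(
                [xi + 0.01 * levy_step(rng) * (xi - bi)
                 for xi, bi in zip(nests[i], best)],
                bounds,
            )
            f_trial = objective(trial)
            if f_trial < fitness[i]:
                nests[i], fitness[i] = trial, f_trial
        # Abandonment phase: with probability pa, replace a nest (never the
        # current best) by the better of a random point and its opposite.
        best_idx = fitness.index(min(fitness))
        for i in range(n_nests):
            if i != best_idx and rng.random() < pa:
                cand = random_point()
                nests[i] = min(cand, opposite(cand, bounds), key=objective)
                fitness[i] = objective(nests[i])

    best_idx = fitness.index(min(fitness))
    return nests[best_idx], fitness[best_idx]


def shifted_sphere(x):
    """Toy objective with its minimum at (1, -2); stands in for model error."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best_x, best_f = ocs_minimize(shifted_sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_x, best_f)
```

Because the best nest is never abandoned and Lévy moves are accepted only when they improve a nest, the best objective value in this sketch cannot get worse from one iteration to the next. To tune a real classifier, `objective` would map a hyperparameter vector to a validation loss, and `bounds` would give the search range of each hyperparameter.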

    Citation: Zhaobin Qiu, Ying Qiao, Wanyuan Shi, Xiaoqian Liu. A robust framework for enhancing cardiovascular disease risk prediction using an optimized category boosting model[J]. Mathematical Biosciences and Engineering, 2024, 21(2): 2943-2969. doi: 10.3934/mbe.2024131

  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)