1. Introduction
Insulin resistance (IR) is a physiological state characterized by diminished responsiveness to insulin signaling in multiple tissues, including skeletal muscle, adipose tissue, and the liver [1]. This state necessitates increased insulin secretion to maintain normal blood glucose levels [2],[3]. IR is a prevalent underlying cause of metabolic syndrome, which is characterized by abdominal obesity, hyperlipidemia, hyperglycemia, and hypertension [4]. IR has also been identified as a potential indicator for the early identification of metabolic syndrome, type 2 diabetes, and cardiovascular disease [5],[6]. Therefore, identifying IR is important for safeguarding individuals' long-term health.
IR can be measured directly with the hyperinsulinemic-euglycemic clamp, but this method has drawbacks such as invasiveness, subject discomfort, and technical complexity [7]. The homeostasis model assessment of insulin resistance (HOMA-IR) is a widely used surrogate for measuring IR indirectly [8]. However, fasting insulin is not typically included in routine blood tests, which can impede the identification of IR when insulin is required to calculate HOMA-IR [9]. A straightforward means of determining IR is therefore still needed for regular screening. In addition, the use of machine learning techniques for disease detection and prediction has surged in popularity in recent years [10]. This application holds promise for enhancing our understanding of how features are associated with health conditions [11]. Although some characteristics contributing to IR have been identified [12]–[14], an appropriate model for accurately predicting IR, specifically in women, is still lacking.
Middle-aged women are at an elevated risk of developing IR because of age-related changes in metabolism and hormonal fluctuations [15]. The menopausal transition is associated with hormonal changes that impact insulin sensitivity [16]. Identifying and managing IR during this pivotal stage can help mitigate the risk of metabolic disorders and cardiovascular disease in later life [17]. Although some researchers have attempted to predict IR using machine learning methods, their models did not account for reproductive health variables [12],[13]. Specifically targeting the female population allows reproductive health factors to be added to the models. Developing predictive models tailored to middle-aged women can improve risk assessment and personalized medicine and address potential disparities [18].
Therefore, we aimed to use machine learning algorithms to identify the optimal IR prediction model from demographic and behavioral factors, laboratory variables, daily nutrient intake, and reproductive health variables in middle-aged, nondiabetic American women based on the National Health and Nutrition Examination Survey (NHANES). We hypothesized that the predictors identified by machine learning methods would provide a more comprehensive metabolic health evaluation than HOMA-IR alone. Additionally, if the IR prediction model is accessible, we can use it to identify women with IR despite the absence of HOMA-IR values. By incorporating multiple biomarkers and clinical indicators, a more nuanced and holistic understanding of IR can be acquired, resulting in focused interventions and preventive strategies.
2. Materials and methods
2.1. Data source
Data were obtained from the NHANES 2007–2018 cycles. The NHANES program is a cross-sectional, periodic health-related initiative in the United States administered by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention. The ongoing survey and examination assess the health and nutritional status of community-dwelling individuals using anthropometric measurements, health and nutrition questionnaires, and laboratory tests. The data are freely accessible to the public. The NCHS research ethics review board approved the NHANES protocol, and each participant signed informed consent forms. Further details regarding the ethical approval of this research are available at https://www.cdc.gov/nchs/nhanes/irba98.htm.
We analyzed middle-aged female NHANES participants from the 2007–2018 survey cycles, which included the same variables of interest. This study defined middle age as 45–64 years [19]. Participants were excluded if they were male, younger than 45 or older than 64 years, lacked laboratory data such as glucose, insulin, and triglycerides, or had no information on daily nutrient intake. We also excluded individuals with diabetes mellitus or cancer because previous research has indicated that these two health conditions can affect IR [20]. The final sample size was 2084.
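For illustration, a minimal pandas sketch of this selection step is shown below. The variable names (RIAGENDR, RIDAGEYR, DIQ010, MCQ220, LBXGLU, LBXIN, LBXTR) are the standard NHANES public-file names and are assumed placeholders, not the exact code used in this study.

```python
import pandas as pd

# Sketch of the cohort selection, assuming the demographic, questionnaire, and
# fasting-laboratory files have already been merged on SEQN. Standard NHANES
# variable names are assumed: RIAGENDR (sex), RIDAGEYR (age), DIQ010 (told had
# diabetes), MCQ220 (told had cancer), LBXGLU (fasting glucose), LBXIN (insulin),
# LBXTR (triglycerides).
def select_cohort(df: pd.DataFrame) -> pd.DataFrame:
    cohort = df[
        (df["RIAGENDR"] == 2)                # female
        & (df["RIDAGEYR"].between(45, 64))   # middle-aged: 45-64 years
        & (df["DIQ010"] != 1)                # exclude self-reported diabetes
        & (df["MCQ220"] != 1)                # exclude self-reported cancer
    ]
    # exclude participants lacking fasting glucose, insulin, or triglycerides
    return cohort.dropna(subset=["LBXGLU", "LBXIN", "LBXTR"])
```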
2.2. Insulin resistance
IR was evaluated using the homeostasis model assessment of IR (HOMA-IR), the predominant approach for estimating IR, calculated as fasting insulin (µU/mL) × fasting glucose (mg/dL)/405 [21]. A HOMA-IR value exceeding 2.73 has previously been shown to indicate the presence of IR in nondiabetic American adults [22]. Therefore, we classified nondiabetic middle-aged women with HOMA-IR values greater than 2.73 as having IR.
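As a simple illustration, the HOMA-IR calculation and the IR classification rule used in this study can be expressed as follows (a minimal sketch, not the exact analysis code):

```python
def homa_ir(fasting_insulin_uU_per_mL: float, fasting_glucose_mg_per_dL: float) -> float:
    """HOMA-IR = fasting insulin (µU/mL) × fasting glucose (mg/dL) / 405."""
    return fasting_insulin_uU_per_mL * fasting_glucose_mg_per_dL / 405.0


def is_insulin_resistant(fasting_insulin_uU_per_mL: float,
                         fasting_glucose_mg_per_dL: float,
                         cutoff: float = 2.73) -> bool:
    """Classify IR when HOMA-IR exceeds the 2.73 cutoff for nondiabetic adults."""
    return homa_ir(fasting_insulin_uU_per_mL, fasting_glucose_mg_per_dL) > cutoff


# Example: fasting insulin of 12 µU/mL and fasting glucose of 100 mg/dL give
# HOMA-IR ≈ 2.96, which falls above the 2.73 cutoff.
print(round(homa_ir(12, 100), 2))      # 2.96
print(is_insulin_resistant(12, 100))   # True
```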
2.3. Predictors
The predictors included demographic and behavioral factors, laboratory data, daily nutrient intake, and reproductive health variables. Demographic and behavioral factors encompassed age, race (non-Hispanic white, non-Hispanic Black, Hispanic, and others), education (high school or below, and college or above), marital status (married/living with a partner, widowed/divorced/separated, and never married), family monthly poverty level index, smoking (current, former, and never), family history of diabetes, body mass index (BMI, kg/m2), physical activity, hypertension, and systolic and diastolic blood pressure (mmHg). We defined hypertension as an affirmative response to either “Has a physician ever informed you that you have high blood pressure?” or “Are you currently taking medication for hypertension?”, a systolic blood pressure ≥140 mmHg, or a diastolic blood pressure ≥90 mmHg. Physical activity was quantified using metabolic equivalent scores [23].
Laboratory data included fasting glucose (mg/dL), high-density lipoprotein cholesterol (HDL-C) (mg/dL), triglycerides (mg/dL), glycohemoglobin (%), total cholesterol (mg/dL), and estimated glomerular filtration rate (eGFR, mL/min/1.73m2). The laboratory data-collection procedures and assays have been documented in earlier publications [24]. We estimated eGFR using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) creatinine equation [25].
We gathered the following information on daily nutrient intake: energy intake (kcal/kg), protein intake ratio (%), carbohydrate intake ratio (%), total fat intake ratio (%), total sugars (gm), dietary fiber (gm), cholesterol intake (mg), folate intake (mcg), total saturated fatty acids (gm), total monounsaturated fatty acids (gm), alcohol (gm), vitamin C (mg), vitamin D (mcg), vitamin B6 (mg), vitamin B12 (mcg), caffeine (mg), iron (mg), calcium (mg), zinc (mg), sodium (mg), phosphorus (mg), magnesium (mg), copper (mg), selenium (mcg), potassium (mg), and theobromine (mg).
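The composite hypertension definition above can be sketched as a simple rule; the column names below (told_high_bp, on_bp_medication, sbp, dbp) are hypothetical placeholders for the corresponding questionnaire and examination variables.

```python
import pandas as pd

# Hypothetical column names: told_high_bp and on_bp_medication are coded 1 for
# "yes"; sbp and dbp are systolic and diastolic blood pressure in mmHg.
def flag_hypertension(df: pd.DataFrame) -> pd.Series:
    return (
        (df["told_high_bp"] == 1)
        | (df["on_bp_medication"] == 1)
        | (df["sbp"] >= 140)
        | (df["dbp"] >= 90)
    ).astype(int)
```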
Reproductive variables included age at menarche, number of pregnancies, number of abortions/miscarriages/stillbirths, birth control pills, hysterectomy, bilateral oophorectomy, and female hormones. Birth control pills, hysterectomy, bilateral oophorectomy, and female hormones were categorical variables (yes and no), while others were continuous.
2.4. Model building process
For model construction, the study subjects were first randomly split into training (80%) and testing (20%) datasets [26]. Next, one-hot encoding was used to encode categorical variables [27], and min-max scaling was used to standardize continuous variables [28], which rescaled variable values to the range 0 to 1 so they could be compared across dimensions. We applied the Synthetic Minority Oversampling Technique (SMOTE) to the training dataset to address class imbalance; creating synthetic samples of the minority class improves its representation and boosts the model's ability to learn from the data [29].
To examine the interrelationships among the predictors, we performed Spearman correlation analyses. If a correlation coefficient was greater than 0.75, one of the two variables was removed to ensure the robustness of the model [30]. Among the 56 predictors, the following pairs had correlation coefficients greater than 0.75: magnesium and potassium (0.81), folate and iron (0.76), monounsaturated and saturated fatty acids (0.81), monounsaturated fatty acids and fat (0.84), carbohydrate and energy (0.87), energy and fat (0.86), saturated fatty acids and fat (0.77), total cholesterol and low-density lipoprotein cholesterol (0.89), BMI and waist circumference (0.87), and number of pregnancies and number of live births (0.87). Based on previous pertinent studies [12],[13], we eliminated magnesium, folate, monounsaturated fatty acids, carbohydrate, fat, low-density lipoprotein cholesterol, waist circumference, and number of live births.
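A minimal sketch of this preprocessing sequence is shown below, assuming a predictor DataFrame X and a binary IR label y (scikit-learn and imbalanced-learn); it illustrates the steps described above rather than reproducing the exact analysis code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from imblearn.over_sampling import SMOTE

# 80/20 random split of subjects into training and testing datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# One-hot encode categorical predictors, aligning the test columns to training
X_train = pd.get_dummies(X_train)
X_test = pd.get_dummies(X_test).reindex(columns=X_train.columns, fill_value=0)

# Min-max scale all predictors to the 0-1 range (scaler fit on training data only)
scaler = MinMaxScaler()
X_train = pd.DataFrame(scaler.fit_transform(X_train), columns=X_train.columns, index=X_train.index)
X_test = pd.DataFrame(scaler.transform(X_test), columns=X_test.columns, index=X_test.index)

# Oversample the minority class with SMOTE in the training dataset only
X_train_bal, y_train_bal = SMOTE(random_state=42).fit_resample(X_train, y_train)
```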
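The collinearity screen can be sketched as follows: compute the pairwise Spearman correlation matrix and flag any pair with a coefficient above 0.75 so that one member of each pair can be dropped (a minimal sketch; the choice of which variable to drop followed prior studies).

```python
import pandas as pd

def highly_correlated_pairs(df: pd.DataFrame, threshold: float = 0.75):
    """Return (variable_a, variable_b, rho) for numeric predictor pairs with
    absolute Spearman correlation above the threshold."""
    corr = df.corr(method="spearman").abs()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if corr.iloc[i, j] > threshold:
                pairs.append((cols[i], cols[j], round(float(corr.iloc[i, j]), 2)))
    return pairs
```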
In this study, four machine learning techniques, namely random forest (RF), extreme gradient boosting (XGBoosting), gradient boosting machine (GBM), and decision tree (DT), were chosen for hyperparameter optimization within the dataset. RF is an ensemble learning algorithm that builds many decision trees during training and outputs the mode of the classes (classification) or the mean prediction (regression) of the individual trees [31]. XGBoosting is a distributed gradient boosting library optimized for efficient and flexible implementation; owing to its speed, scalability, and performance, it is extensively used in machine learning competitions and industry applications [32]. GBM sequentially constructs an ensemble of weak learners (typically decision trees) and is renowned for its predictive accuracy and resistance to overfitting [33]. DT is a nonparametric supervised learning algorithm for classification and regression tasks that splits the data into subsets according to feature values and builds a tree-like structure for making predictions [34].
Additionally, a 5-fold cross-validation method was employed. The training dataset was divided into five groups, with one group serving as the internal validation set and the remaining four as the internal training set. A grid search was used to evaluate candidate hyperparameter combinations, and the hyperparameters were selected to maximize the average area under the curve (AUC) of the receiver operating characteristic (ROC) across the internal validation sets [35]. After model training was complete, we used the testing dataset for validation.
We compared the performance of the models using the AUC of ROC, accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score [36]. The F1 score is the harmonic mean of precision and sensitivity [37]. Finally, we used the SHapley Additive exPlanations (SHAP) framework to illustrate the predictors' contributions and identify their threshold values [38].
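As an illustration, the four classifiers can be instantiated as follows (scikit-learn and the xgboost Python package); the settings shown are defaults and placeholders, as the hyperparameters actually used were selected by the grid search described below.

```python
from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Candidate models; hyperparameters here are placeholders to be tuned by grid search.
models = {
    "XGBoosting": XGBClassifier(eval_metric="logloss", random_state=42),
    "RF": RandomForestClassifier(random_state=42),
    "GBM": GradientBoostingClassifier(random_state=42),
    "DT": DecisionTreeClassifier(random_state=42),
}
```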
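Continuing the sketch above, a 5-fold grid search maximizing the ROC AUC might look as follows; the parameter grid is illustrative, not the exact grid used in this study.

```python
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Illustrative hyperparameter grid for the XGBoosting model
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
}

search = GridSearchCV(
    estimator=XGBClassifier(eval_metric="logloss", random_state=42),
    param_grid=param_grid,
    scoring="roc_auc",   # hyperparameters chosen to maximize internal-validation AUC
    cv=5,                # 5-fold cross-validation on the training dataset
    n_jobs=-1,
)
search.fit(X_train_bal, y_train_bal)   # SMOTE-balanced training data from the sketch above
best_model = search.best_estimator_
```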
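The evaluation on the held-out testing dataset and the SHAP analysis can be sketched as follows (continuing the examples above); the shap package's TreeExplainer is used because all four candidate models are tree-based.

```python
import shap
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

# Performance metrics on the testing dataset
y_prob = best_model.predict_proba(X_test)[:, 1]
y_pred = best_model.predict(X_test)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
metrics = {
    "AUC": roc_auc_score(y_test, y_prob),
    "Accuracy": (tp + tn) / (tp + tn + fp + fn),
    "Sensitivity": tp / (tp + fn),
    "Specificity": tn / (tn + fp),
    "PPV": tp / (tp + fp),
    "NPV": tn / (tn + fn),
    "F1": f1_score(y_test, y_pred),
}

# SHAP values for the tree-based model: global feature contributions
explainer = shap.TreeExplainer(best_model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
```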
2.5. Statistical analysis
Since NHANES uses a multistage, complex probability sampling design, we used weighted means (95% confidence intervals) to describe continuous predictors and frequencies (weighted percentages) to characterize categorical ones. We compared predictor differences between IR statuses using the Rao-Scott chi-square test for categorical variables and the t-test for continuous variables.
Because participants lacking laboratory or daily nutrient intake data were excluded, only the family monthly poverty level index and physical activity had missing values, which we imputed with their means. All p-values were two-sided, and p < 0.05 was considered statistically significant. Statistical analyses were conducted using SAS (version 9.4; SAS Institute, Cary, NC, USA), and the machine learning algorithms were implemented using R (version 4.3.0) and Python (version 3.10.11).
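For reference, the weighted mean of a continuous predictor is computed as shown below; this simplified sketch omits the stratum and primary sampling unit information needed for design-correct variance and confidence interval estimation, which the SAS survey procedures handle.

```python
import numpy as np

def weighted_mean(values: np.ndarray, weights: np.ndarray) -> float:
    """Survey-weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return float(np.sum(weights * values) / np.sum(weights))
```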
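The mean imputation for these two variables can be sketched as follows; the column names are hypothetical placeholders for the family monthly poverty level index and the MET-based physical activity score, and df denotes the analytic DataFrame.

```python
from sklearn.impute import SimpleImputer

# Replace missing values with the column means (hypothetical column names)
cols = ["family_poverty_index", "physical_activity_met"]
df[cols] = SimpleImputer(strategy="mean").fit_transform(df[cols])
```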
3. Results
3.1. Baseline characteristics
The study initially included 56 predictors. After excluding variables that were highly correlated with another predictor, 48 predictors were included in the analysis (Supplemental Figures 1 and 2).
In Table 1, the data (n = 2084) were divided into the IR group (HOMA-IR > 2.73, n = 848) and the non-IR group (HOMA-IR ≤ 2.73, n = 1236). The IR group was more likely to include non-Hispanic Black or Hispanic women, to have a lower family monthly poverty level index, a higher BMI, and lower levels of physical activity, and to be diagnosed with hypertension. Regarding the laboratory data, the IR group had significantly higher mean values of glucose, HDL-C, triglycerides, glycohemoglobin, and total cholesterol than the non-IR group.
The mean values of energy, sugar, cholesterol, saturated fatty acid, alcohol, sodium, and phosphorus intake were significantly different between the IR and non-IR groups. The two groups had no significant differences in the mean values of other daily nutrients.
Women in the IR group had a lower mean age at menarche and more pregnancies than those in the non-IR group. A higher proportion of women in the IR group had histories of hysterectomy or bilateral oophorectomy. However, no significant differences were observed in the number of abortions/miscarriages/stillbirths, oral contraceptive use, and hormone therapy treatment between the two groups.
3.2. Comparing IR prediction models with 48 predictors
We randomly assigned 1667 of the 2084 nondiabetic women to the training dataset and 417 to the testing dataset. Table 2 summarizes the performance metrics in the training and testing datasets for XGBoosting, RF, GBM, and DT with 48 predictors. In the training dataset, the AUC of ROC for all models exceeded 0.85, with the maximum AUC value of 0.93 achieved by the XGBoosting model. The XGBoosting algorithm also exhibited superior performance in terms of accuracy (0.86), specificity (0.80), PPV (0.87), NPV (0.85), and F1 score (0.88), followed by the RF, GBM, and DT models.
In the testing dataset, all AUCs of ROC were equal to or greater than 0.80, with the maximum of 0.86 achieved by XGBoosting. XGBoosting also had the highest accuracy (0.79), specificity (0.69), PPV (0.80), and F1 score (0.83), followed by the RF, GBM, and DT models. Figure 1 illustrates the ROC curves for predicting IR from the four models with 48 predictors.
3.3. Relative importance of 48 predictors in the XGBoosting model
Figure 2 depicts the relative importance of the features in the XGBoosting model, with relative importance values of 0.3007, 0.2082, 0.3884, and 0.1022 for demographic and behavioral factors, laboratory variables, daily nutrient intake, and reproductive health variables, respectively. Among the 48 variables, BMI (0.1235) had the greatest influence on IR, followed by glucose (0.0775), HDL-C (0.0384), glycohemoglobin (0.0347), and triglycerides (0.0273).
3.4. Comparing IR prediction models with five predictors
Based on the feature importance analysis of the 48 predictors, we used the top five predictors to develop new models. Table 3 presents the performance metrics of the four machine learning algorithms in the training and testing datasets with five predictors. With the top five predictors, the AUC of ROC for all four models remained at or above 0.87 in the training dataset, and the XGBoosting model achieved a relatively high AUC of 0.90. The XGBoosting algorithm also outperformed the other three algorithms in accuracy (0.82), sensitivity (0.87), specificity (0.75), PPV (0.83), NPV (0.79), and F1 score (0.85). In the testing dataset, all AUCs of ROC were equal to or greater than 0.83, with the AUC of XGBoosting being the greatest at 0.86. Figure 3 illustrates the ROC curves for predicting IR from the four models with the five predictors. A sensitivity analysis using the top five predictors without the synthetic minority oversampling technique showed similar performance metrics (Supplemental Table 1).
3.5. Relative importance of the five predictors in four models
The top five predictors in the XGBoosting, RF, and GBM models were BMI, glucose, HDL-C, glycohemoglobin, and triglycerides, whereas the DT model included daily sugar intake instead of glycohemoglobin. BMI, glucose, and HDL-C were the top three predictors in all four models. The relative importance of BMI was 0.42, 0.47, 0.54, and 0.54 in the XGBoosting, RF, GBM, and DT models, respectively; the corresponding values were 0.26, 0.26, 0.28, and 0.32 for glucose and 0.12, 0.12, 0.08, and 0.06 for HDL-C (Figure 4). A sensitivity analysis of feature importance using the top five predictors without the synthetic minority oversampling technique produced similar results (Supplemental Figure 3).
3.6. SHAP value of the five predictors in the XGBoosting model
Figure 5 indicates the relationship between the XGBoosting model's five predictors and their SHAP values. The SHAP values of BMI, glucose, glycohemoglobin, and triglycerides increase as their levels rise. However, as HDL-C increases, its SHAP value decreases. The threshold values for predicting IR were identified to be 29 kg/m2, 100 mg/dL, 54.5 mg/dL, 89 mg/dL, and 5.6% for BMI, glucose, HDL-C, triglycerides, and glycohemoglobin, respectively.
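A SHAP dependence plot of this kind can be produced as follows (continuing the SHAP sketch in the Methods section); the feature name is a placeholder for the corresponding column in the testing dataset, and the threshold is read off as the level at which the SHAP values cross zero.

```python
import shap

# Dependence plot for a single predictor, e.g., BMI (placeholder column name);
# the x-value where SHAP values change sign approximates the decision threshold.
shap.dependence_plot("BMI", shap_values, X_test, interaction_index=None)
```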
In Figure 6, the SHAP values for the XGBoosting algorithm reveal the associations between the five predictors and IR, including BMI (strongly positive impact on IR prediction), fasting glucose (strongly positive), HDL-C (medium negative), triglycerides (medium positive), and glycohemoglobin (medium positive). Additionally, the SHAP decision plot further enhances the visualization of the importance and direction of these predictors' contribution, as depicted in Supplemental Figure 4.
4. Discussion
We found that the XGBoosting model was the best of the four machine learning algorithms for predicting IR in middle-aged nondiabetic women. The AUC of the ROC curve was 0.90 in the training dataset and 0.86 in the testing dataset using the five predictors of BMI, glucose, HDL-C, glycohemoglobin, and triglycerides. Using the SHAP framework, we also determined the threshold values of the five predictors to predict IR.
The findings of our investigation align with those of prior studies [12]–[14], demonstrating that the XGBoosting algorithm was the optimal model for IR prediction. A study based on the NHANES from 1999 to 2012 reported that the XGBoosting model had a higher AUC of ROC than other machine learning algorithms (RF, logistic regression, and deep neural networks) for predicting IR in 1229 adults with chronic kidney disease [13]. Similarly, a Chinese study identified XGBoosting as the optimal of five machine learning techniques, with a relatively high AUC (0.85), for predicting IR in 503 children aged 6–12 years [14].
XGBoosting is frequently recognized as the optimal model for predicting health conditions because of its ability to capture complex, nonlinear relationships between features and the presence of a disease, which makes it well suited for modeling intricate biological and clinical interactions [39]. Moreover, XGBoosting performs well on tasks requiring high accuracy [40]. It provides a range of hyperparameters that can be adjusted to suit the individual attributes of disease datasets, and by tuning these parameters, researchers can refine the model to improve its predictive performance [41].
Previous studies used different numbers of top features, ranging from 5 to 20, to predict diseases of interest [12],[35]. We observed that when the top five predictors were used to discriminate IR, the models' performance metrics did not decline appreciably compared with models using all 48 predictors. For example, the AUC of the XGBoosting model in the training dataset decreased only from 0.93 to 0.90 when the number of predictors was reduced from 48 to 5.
In addition, the XGBoosting, RF, and GBM models all identified BMI, glucose, HDL-C, glycohemoglobin, and triglycerides as the top five predictors, and all four models consistently ranked BMI, glucose, and HDL-C as the three most influential predictors. These results show both congruence and divergence with prior findings. Multiple prior studies have demonstrated significant associations of BMI, glucose, HDL-C, and triglycerides with IR and their potential utility in IR prediction models [42],[43]. Notably, some studies have reported a critical role of blood pressure in predicting IR [44],[45], but neither diastolic nor systolic blood pressure appeared among the top five predictors of our four models. As depicted in Figure 2, hypertension ranked eighth among the 48 predictors in predicting IR, whereas diastolic and systolic blood pressure ranked outside the top twenty.
The top three of the five predictors accounted for approximately 80% of the relative importance in XGBoosting, 85% in RF, 90% in GBM, and 92% in DT, implying substantial effects of BMI, glucose, and HDL-C on IR (Figure 4). BMI scored unexpectedly highly, despite not being one of the parameters used in calculating HOMA-IR. In alignment with the present study, prior research has documented a robust association between BMI and IR or metabolic syndrome [46],[47]. As BMI increases, the body accumulates more fat, particularly in the abdominal region, which increases the likelihood of IR [48]. In addition, a high BMI may induce a state of chronic low-grade inflammation; inflammatory signals emitted by adipose tissue can interfere with insulin signaling and decrease insulin sensitivity in cells [49]. A high BMI may also disrupt the balance of adipokines, which can contribute to IR [50].
The dependence plots show that the threshold values for predicting IR in the XGBoosting model were 29 kg/m2, 100 mg/dL, 54.5 mg/dL, 89 mg/dL, and 5.6% for BMI, glucose, HDL-C, triglycerides, and glycohemoglobin, respectively. These findings are both consistent with and distinct from those of previous research. A glucose level of 100 mg/dL serves as a diagnostic criterion for metabolic syndrome [51], while glucose levels of 100–125 mg/dL and/or glycohemoglobin levels of 5.7%–6.4% can be used to diagnose prediabetes [52]. However, the BMI threshold for IR varies depending on the study and the population under consideration. One study indicated that a BMI ≥ 25 kg/m2 is a risk factor for IR [53], whereas another found that a BMI ≥ 27 kg/m2 is optimal for identifying metabolic syndrome in adult populations [47]. A further noteworthy disparity is that the cutoff for triglycerides in diagnosing metabolic syndrome is 150 mg/dL [54], whereas our results indicate a threshold of 89 mg/dL for predicting IR. These variations may be attributable to differences in study design and population characteristics, but additional research is necessary to validate our findings.
The SHAP framework provides additional insight into how individual features contribute to the model's predictions, with BMI (strongly), glucose (strongly), glycohemoglobin (moderately), and triglycerides (moderately) having positive impacts on IR, and HDL-C having a moderate negative impact. The decision plot depicting correct classification and misclassification provides additional evidence of these predictors' influence on IR (Supplemental Figure 4). Furthermore, our predictive model broadly agrees with earlier findings [12],[13],[55]. In a study based on the Korea National Health and Nutrition Examination Survey (2007–2009), an XGBoosting model applied to 8842 individuals aged 40–74 years indicated that glucose had a robust positive effect on IR, while glycohemoglobin (positive) and HDL-C (negative) had moderate effects [12]. Moreover, these predictors are clinically commonplace and simple to measure, suggesting significant promise for IR screening and prediction in middle-aged women.
Despite the high accuracy and precision achieved by the XGBoosting model with the top five predictors, the influence of other factors, such as behavioral, nutritional, and reproductive health variables, cannot be disregarded. For instance, although energy intake was not among the top five predictors, it had the highest feature importance among the daily nutrient intake variables. Such characteristics can affect BMI, glucose levels, and other laboratory indicators [56]–[58]; consequently, these anthropometric and laboratory indicators can be employed more directly to identify IR.
A strength of this study is its novelty: to our knowledge, it is the first investigation to use machine learning to develop IR prediction models in middle-aged nondiabetic women. Furthermore, the models incorporated an extensive range of variables, encompassing demographic factors, behavioral lifestyles, laboratory data, daily intake of macronutrients and micronutrients, and reproductive profiles. In addition, the SHAP framework helps explain the impact of each feature on IR. The clinical significance of our study is that the machine learning-based predictive model could potentially provide women with early warnings using routine clinical measurements.
Our investigation has some limitations. The data were from a cross-sectional survey, so the findings cannot be interpreted as cause-and-effect relationships. Demographic, behavioral lifestyle, and reproductive health variables were acquired through self-reported questionnaires, which may be subject to recall bias. In addition, the fasting laboratory tests were not administered to all NHANES participants, resulting in a sample size of 2084. Increasing the sample size could improve the model's performance [59], as a larger volume of data may enable the model to identify latent patterns more precisely and enhance its capacity to generalize to unseen data [60]. Finally, the presence of IR was evaluated using HOMA-IR rather than the hyperinsulinemic-euglycemic clamp. Nevertheless, we posit that our predictive model can explain IR in middle-aged nondiabetic women based on these characteristics in the current dataset.
5. Conclusions
In this study, we used four machine learning algorithms, namely XGBoosting, random forest, gradient boosting machine, and decision tree, to identify IR in 2084 middle-aged women without diabetes. Our analysis involved 48 variables encompassing demographic and behavioral factors, laboratory variables, daily nutrient intake, and reproductive health variables. The XGBoosting algorithm demonstrated the highest AUC of ROC, followed by the RF, GBM, and DT models. When modeled with the top five predictors, the XGBoosting model's performance metrics remained optimal, with BMI (strongly positive impact), fasting glucose (strongly positive impact), HDL-C (medium negative impact), triglycerides (medium positive impact), and glycohemoglobin (medium positive impact) being associated with IR.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.