Machine Learning Models in Type 2 Diabetes Risk Prediction: Results from a Cross-sectional Retrospective Study in Chinese Adults

Xiong, Xiao-lu; Zhang, Rong-xin; Bi, Yan; Zhou, Wei-hong; Yu, Yun; Zhu, Da-long

doi:10.1007/s11596-019-2077-4

Published: 25 July 2019

Current Medical Science, Volume 39, pages 582–588 (2019)


Summary

Type 2 diabetes mellitus (T2DM) has become a prevalent health problem in China, especially in urban areas. Early prevention strategies are needed to reduce the associated mortality and morbidity. We applied a combination of rules and several machine learning techniques to assess the risk of developing T2DM in an urban Chinese adult population. A retrospective analysis was performed on 8000 people without diabetes and 3845 people with T2DM in Nanjing. Five machine learning techniques, Multilayer Perceptron (MLP), AdaBoost (AD), Trees Random Forest (TRF), Support Vector Machine (SVM), and Gradient Tree Boosting (GTB), were used with 10-fold cross-validation in the proposed model to predict the risk of developing T2DM. The performance of these models was evaluated by accuracy, precision, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (AUC). The prediction accuracies of the five models were 0.87, 0.86, 0.86, 0.86 and 0.86, respectively. A combination model that gave each component the same voting weight was then built, and it performed better than the individual models. The findings indicate that combining machine learning models could provide an accurate assessment model for T2DM risk prediction.
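The equal-weight voting and the evaluation metrics named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation, which the summary does not show: the toy labels, the variable names, the helper functions `majority_vote` and `binary_metrics`, and the tie rule are all assumptions made for illustration, and the numbers it produces bear no relation to the study's results.

```python
def majority_vote(predictions):
    """Combine per-model 0/1 predictions, giving every model the same weight.

    predictions: list of lists, one inner list of labels per model.
    A tie (only possible with an even number of models) is resolved
    in favour of class 1; that rule is an assumption of this sketch.
    """
    n_models = len(predictions)
    combined = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        positives = sum(votes)
        combined.append(1 if 2 * positives >= n_models else 0)
    return combined


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Hypothetical toy data: 1 = T2DM, 0 = no diabetes.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical outputs of five classifiers (stand-ins for MLP, AD, TRF, SVM, GTB).
model_outputs = [
    [1, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 1],
]

ensemble = majority_vote(model_outputs)
ensemble_metrics = binary_metrics(y_true, ensemble)
first_model_metrics = binary_metrics(y_true, model_outputs[0])

if __name__ == "__main__":
    print("first model:", first_model_metrics)
    print("ensemble:   ", ensemble_metrics)
```

In this toy case each model errs on different samples, so the vote cancels the individual errors. That is the general motivation for combining models, although how much an ensemble gains on real data depends on how correlated the component errors are.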

Author information

Xiao-lu Xiong and Rong-xin Zhang contributed equally to this work.

Authors and Affiliations

Department of Endocrinology, Nanjing Drum Tower Hospital Clinical College of Nanjing Medical University, Nanjing, 210008, China
Xiao-lu Xiong, Yan Bi, Wei-hong Zhou & Da-long Zhu
School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, China
Rong-xin Zhang & Yun Yu

Corresponding authors

Correspondence to Wei-hong Zhou, Yun Yu or Da-long Zhu.

Additional information

This work was supported by grants from the National Natural Science Foundation of China (No. 81570737, No. 81370947, No. 81570736, No. 81770819, No. 81500612, No. 81400832, No. 81600637, No. 81600632, and No. 81703294), the National Key Research and Development Program of China (No. 2016YFC1304804 and No. 2017YFC1309605), the Jiangsu Provincial Key Medical Discipline (No. ZDXKB2016012), the Key Project of Nanjing Clinical Medical Science, the Key Research and Development Program of Jiangsu Province of China (No. BE2015604 and No. BE2016606), the Jiangsu Provincial Medical Talent (No. ZDRCA2016062), and the Nanjing Science and Technology Development Project (No. 201605019).

Conflict of Interest Statement

The authors declare that they have no conflicts of interest.

About this article

Cite this article

Xiong, Xl., Zhang, Rx., Bi, Y. et al. Machine Learning Models in Type 2 Diabetes Risk Prediction: Results from a Cross-sectional Retrospective Study in Chinese Adults. CURR MED SCI 39, 582–588 (2019). https://doi.org/10.1007/s11596-019-2077-4

Received: 17 July 2018
Revised: 10 June 2019
Published: 25 July 2019
Issue Date: August 2019
DOI: https://doi.org/10.1007/s11596-019-2077-4
