Skip to main content
Top
Published in: BMC Medical Informatics and Decision Making 1/2019

Open Access 01-12-2019 | Stroke | Research article

Accurate and rapid screening model for potential diabetes mellitus

Authors: Dongmei Pei, Yang Gong, Hong Kang, Chengpu Zhang, Qiyong Guo

Published in: BMC Medical Informatics and Decision Making | Issue 1/2019

Login to get access

Abstract

Background

Prediction or early diagnosis of diabetes is crucial for populations with high risk of diabetes.

Methods

In this study, we assessed the ability of five popular classifiers (J48, AdaboostM1, SMO, Bayes Net, and Naïve Bayes) to identify individuals with diabetes based on nine non-invasive and easily obtained clinical features, including age, gender, body mass index (BMI), hypertension, history of cardiovascular disease or stroke, family history of diabetes, physical activity, work stress, and salty food preference. A total of 4205 data entries were obtained from annual physical examination reports for adults in the Shengjing Hospital of China Medical University during January–April 2017. Weka data mining software was used to identify the best algorithm for diabetes classification.

Results

The results indicate that decision tree classifier J48 has the best performance (accuracy = 0.9503, precision = 0.950, recall = 0.950, F-measure = 0.948, and AUC = 0.964). The decision tree structure shows that age is the most significant feature, followed by family history of diabetes, work stress, BMI, salty food preference, physical activity, hypertension, gender, and history of cardiovascular disease or stroke.

Conclusions

Our study shows that decision tree analyses can be applied to screen individuals for early diabetes risk without the need for invasive tests. This procedure will be particularly useful in developing regions with high epidemiological risk and poor socioeconomic status, and enable clinical practitioners to rapidly screen patients for increased risk of diabetes. The key features in the tree structure could further facilitate diabetes prevention through targeted community interventions, which can potentially improve early diabetes diagnosis and reduce burdens on the healthcare system.
Literature
2.
go back to reference Hadaegh F, et al. High prevalence of undiagnosed diabetes and abnormal glucose tolerance in the Iranian urban population: Tehran lipid and glucose study. BMC Public Health. 2008;8:176.CrossRef Hadaegh F, et al. High prevalence of undiagnosed diabetes and abnormal glucose tolerance in the Iranian urban population: Tehran lipid and glucose study. BMC Public Health. 2008;8:176.CrossRef
3.
go back to reference Jahani M, Mahdavi M. Comparison of predictive models for the early diagnosis of diabetes. Healthc Inform Res. 2016;22(2):95–100.CrossRef Jahani M, Mahdavi M. Comparison of predictive models for the early diagnosis of diabetes. Healthc Inform Res. 2016;22(2):95–100.CrossRef
4.
go back to reference Pippitt K, Li M, Gurgle HE. Diabetes mellitus: screening and diagnosis. Am Fam Physician. 2016;93(2):103–9.PubMed Pippitt K, Li M, Gurgle HE. Diabetes mellitus: screening and diagnosis. Am Fam Physician. 2016;93(2):103–9.PubMed
6.
go back to reference Lelis VM, Guzman E, Belmonte MV. A statistical classifier to support diagnose meningitis in less developed areas of Brazil. J Med Syst. 2017;41(9):145.CrossRef Lelis VM, Guzman E, Belmonte MV. A statistical classifier to support diagnose meningitis in less developed areas of Brazil. J Med Syst. 2017;41(9):145.CrossRef
7.
go back to reference World Health Organization, 2008–2013 action plan for the global strategy for the prevention and control of non-communicable disease. World Health Organization, 2008–2013 action plan for the global strategy for the prevention and control of non-communicable disease.
8.
9.
go back to reference Olivera AR, et al. Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study. Sao Paulo Med J. 2017;135(3):234–46.CrossRef Olivera AR, et al. Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study. Sao Paulo Med J. 2017;135(3):234–46.CrossRef
10.
go back to reference Habibi S, Ahmadi M, Alizadeh S. Type 2 diabetes mellitus screening and risk factors using decision tree: results of data mining. Glob J Health Sci. 2015;7(5):304–10.CrossRef Habibi S, Ahmadi M, Alizadeh S. Type 2 diabetes mellitus screening and risk factors using decision tree: results of data mining. Glob J Health Sci. 2015;7(5):304–10.CrossRef
11.
go back to reference Huang ML, Chen HY. Glaucoma classification model based on GDx VCC measured parameters by decision tree. J Med Syst. 2010;34(6):1141–7.CrossRef Huang ML, Chen HY. Glaucoma classification model based on GDx VCC measured parameters by decision tree. J Med Syst. 2010;34(6):1141–7.CrossRef
12.
go back to reference Farion K, et al. A tree-based decision model to support prediction of the severity of asthma exacerbations in children. J Med Syst. 2010;34(4):551–62.CrossRef Farion K, et al. A tree-based decision model to support prediction of the severity of asthma exacerbations in children. J Med Syst. 2010;34(4):551–62.CrossRef
13.
go back to reference Gregori D, et al. Non-invasive risk stratification of coronary artery disease: an evaluation of some commonly used statistical classifiers in terms of predictive accuracy and clinical usefulness. J Eval Clin Pract. 2009;15(5):777–81.CrossRef Gregori D, et al. Non-invasive risk stratification of coronary artery disease: an evaluation of some commonly used statistical classifiers in terms of predictive accuracy and clinical usefulness. J Eval Clin Pract. 2009;15(5):777–81.CrossRef
14.
go back to reference Chao CM, et al. Construction the model on the breast cancer survival analysis use support vector machine, logistic regression and decision tree. J Med Syst. 2014;38(10):106.CrossRef Chao CM, et al. Construction the model on the breast cancer survival analysis use support vector machine, logistic regression and decision tree. J Med Syst. 2014;38(10):106.CrossRef
15.
go back to reference Kate RJ, Nadig R. Stage-specific predictive models for breast cancer survivability. Int J Med Inform. 2017;97:304–11.CrossRef Kate RJ, Nadig R. Stage-specific predictive models for breast cancer survivability. Int J Med Inform. 2017;97:304–11.CrossRef
16.
go back to reference Takada M, et al. Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model. BMC Med Inform Decis Mak. 2012;12:54.CrossRef Takada M, et al. Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model. BMC Med Inform Decis Mak. 2012;12:54.CrossRef
17.
go back to reference Cakir A, Demirel B. A software tool for determination of breast cancer treatment methods using data mining approach. J Med Syst. 2011;35(6):1503–11.CrossRef Cakir A, Demirel B. A software tool for determination of breast cancer treatment methods using data mining approach. J Med Syst. 2011;35(6):1503–11.CrossRef
18.
go back to reference Saybani MR, et al. Diagnosing tuberculosis with a novel support vector machine-based artificial immune recognition system. Iran Red Crescent Med J. 2015;17(4):e24557.CrossRef Saybani MR, et al. Diagnosing tuberculosis with a novel support vector machine-based artificial immune recognition system. Iran Red Crescent Med J. 2015;17(4):e24557.CrossRef
19.
go back to reference Zmiri D, Shahar Y, Taieb-Maimon M. Classification of patients by severity grades during triage in the emergency department using data mining methods. J Eval Clin Pract. 2012;18(2):378–88.CrossRef Zmiri D, Shahar Y, Taieb-Maimon M. Classification of patients by severity grades during triage in the emergency department using data mining methods. J Eval Clin Pract. 2012;18(2):378–88.CrossRef
20.
go back to reference Meng X-H, et al. Comparison of three data mining models for predicting diabetes or prediabetes by risk factors. Kaohsiung J Med Sci. 2013;29(2):93–9.CrossRef Meng X-H, et al. Comparison of three data mining models for predicting diabetes or prediabetes by risk factors. Kaohsiung J Med Sci. 2013;29(2):93–9.CrossRef
21.
go back to reference Ramezankhani A, et al. Applying decision tree for identification of a low risk population for type 2 diabetes. Tehran lipid and glucose study. Diabetes Res Clin Pract. 2014;105(3):391–8.CrossRef Ramezankhani A, et al. Applying decision tree for identification of a low risk population for type 2 diabetes. Tehran lipid and glucose study. Diabetes Res Clin Pract. 2014;105(3):391–8.CrossRef
22.
go back to reference Yu W, et al. Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes. BMC Med Inform Decis Mak. 2010;10:16.CrossRef Yu W, et al. Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes. BMC Med Inform Decis Mak. 2010;10:16.CrossRef
23.
go back to reference Boyce S, et al. Evaluation of prediction models for the staging of prostate cancer. BMC Med Inform Decis Mak. 2013;13:126.CrossRef Boyce S, et al. Evaluation of prediction models for the staging of prostate cancer. BMC Med Inform Decis Mak. 2013;13:126.CrossRef
24.
go back to reference Kammerer JS, et al. Tuberculosis transmission in nontraditional settings: a decision-tree approach. Am J Prev Med. 2005;28(2):201–7.CrossRef Kammerer JS, et al. Tuberculosis transmission in nontraditional settings: a decision-tree approach. Am J Prev Med. 2005;28(2):201–7.CrossRef
25.
go back to reference Tayefi M, et al. The application of a decision tree to establish the parameters associated with hypertension. Comput Methods Prog Biomed. 2017;139:83–91.CrossRef Tayefi M, et al. The application of a decision tree to establish the parameters associated with hypertension. Comput Methods Prog Biomed. 2017;139:83–91.CrossRef
26.
go back to reference Alickovic E, Subasi A. Medical decision support system for diagnosis of heart arrhythmia using DWT and random forests classifier. J Med Syst. 2016;40(4):108.CrossRef Alickovic E, Subasi A. Medical decision support system for diagnosis of heart arrhythmia using DWT and random forests classifier. J Med Syst. 2016;40(4):108.CrossRef
27.
go back to reference Hu FB, et al. Diet, lifestyle, and the risk of type 2 diabetes mellitus in women. N Engl J Med. 2001;345(11):790–7.CrossRef Hu FB, et al. Diet, lifestyle, and the risk of type 2 diabetes mellitus in women. N Engl J Med. 2001;345(11):790–7.CrossRef
28.
go back to reference Anderson JW, et al. Carbohydrate and fiber recommendations for individuals with diabetes: a quantitative assessment and meta-analysis of the evidence. J Am Coll Nutr. 2004;23(1):5–17.CrossRef Anderson JW, et al. Carbohydrate and fiber recommendations for individuals with diabetes: a quantitative assessment and meta-analysis of the evidence. J Am Coll Nutr. 2004;23(1):5–17.CrossRef
29.
go back to reference Colditz GA, et al. Weight gain as a risk factor for clinical diabetes mellitus in women. Ann Intern Med. 1995;122(7):481–6.CrossRef Colditz GA, et al. Weight gain as a risk factor for clinical diabetes mellitus in women. Ann Intern Med. 1995;122(7):481–6.CrossRef
30.
go back to reference Koppes LL, et al. Moderate alcohol consumption lowers the risk of type 2 diabetes: a meta-analysis of prospective observational studies. Diabetes Care. 2005;28(3):719–25.CrossRef Koppes LL, et al. Moderate alcohol consumption lowers the risk of type 2 diabetes: a meta-analysis of prospective observational studies. Diabetes Care. 2005;28(3):719–25.CrossRef
31.
go back to reference Li CP, et al. Performance comparison between logistic regression, decision trees, and multilayer perceptron in predicting peripheral neuropathy in type 2 diabetes mellitus. Chin Med J. 2012;125(5):851–7.PubMed Li CP, et al. Performance comparison between logistic regression, decision trees, and multilayer perceptron in predicting peripheral neuropathy in type 2 diabetes mellitus. Chin Med J. 2012;125(5):851–7.PubMed
32.
go back to reference Lavrac N. Selected techniques for data mining in medicine. Artif Intell Med. 1999;16(1):3–23.CrossRef Lavrac N. Selected techniques for data mining in medicine. Artif Intell Med. 1999;16(1):3–23.CrossRef
33.
go back to reference Bellazzi R, Zupan B. Predictive data mining in clinical medicine: current issues and guidelines. Int J Med Inform. 2008;77(2):81–97.CrossRef Bellazzi R, Zupan B. Predictive data mining in clinical medicine: current issues and guidelines. Int J Med Inform. 2008;77(2):81–97.CrossRef
34.
go back to reference Marinov M, et al. Data-mining technologies for diabetes: a systematic review. J Diabetes Sci Technol. 2011;5(6):1549–56.CrossRef Marinov M, et al. Data-mining technologies for diabetes: a systematic review. J Diabetes Sci Technol. 2011;5(6):1549–56.CrossRef
35.
go back to reference Ravikumar P, et al. Prevalence and risk factors of diabetes in a community-based study in North India: the Chandigarh urban diabetes study (CUDS). Diabetes Metab. 2011;37(3):216–21.CrossRef Ravikumar P, et al. Prevalence and risk factors of diabetes in a community-based study in North India: the Chandigarh urban diabetes study (CUDS). Diabetes Metab. 2011;37(3):216–21.CrossRef
36.
go back to reference Reis JP, et al. Lifestyle factors and risk for new-onset diabetes: a population-based cohort study. Ann Intern Med. 2011;155(5):292–9.CrossRef Reis JP, et al. Lifestyle factors and risk for new-onset diabetes: a population-based cohort study. Ann Intern Med. 2011;155(5):292–9.CrossRef
37.
go back to reference Li G, et al. The long-term effect of lifestyle interventions to prevent diabetes in the China Da Qing diabetes prevention study: a 20-year follow-up study. Lancet. 2008;371(9626):1783–9.CrossRef Li G, et al. The long-term effect of lifestyle interventions to prevent diabetes in the China Da Qing diabetes prevention study: a 20-year follow-up study. Lancet. 2008;371(9626):1783–9.CrossRef
38.
go back to reference Saaristo T, et al. Lifestyle intervention for prevention of type 2 diabetes in primary health care: one-year follow-up of the Finnish National Diabetes Prevention Program (FIN-D2D). Diabetes Care. 2010;33(10):2146–51.CrossRef Saaristo T, et al. Lifestyle intervention for prevention of type 2 diabetes in primary health care: one-year follow-up of the Finnish National Diabetes Prevention Program (FIN-D2D). Diabetes Care. 2010;33(10):2146–51.CrossRef
39.
go back to reference Tuomilehto J, et al. Prevention of type 2 diabetes mellitus by changes in lifestyle among subjects with impaired glucose tolerance. N Engl J Med. 2001;344(18):1343–50.CrossRef Tuomilehto J, et al. Prevention of type 2 diabetes mellitus by changes in lifestyle among subjects with impaired glucose tolerance. N Engl J Med. 2001;344(18):1343–50.CrossRef
40.
go back to reference Anderson AE, et al. Electronic health record phenotyping improves detection and screening of type 2 diabetes in the general United States population: a cross-sectional, unselected, retrospective study. J Biomed Inform. 2016;60:162–8.CrossRef Anderson AE, et al. Electronic health record phenotyping improves detection and screening of type 2 diabetes in the general United States population: a cross-sectional, unselected, retrospective study. J Biomed Inform. 2016;60:162–8.CrossRef
41.
go back to reference Devoe JE, et al. Electronic health records vs Medicaid claims: completeness of diabetes preventive care data in community health centers. Ann Fam Med. 2011;9(4):351–8.CrossRef Devoe JE, et al. Electronic health records vs Medicaid claims: completeness of diabetes preventive care data in community health centers. Ann Fam Med. 2011;9(4):351–8.CrossRef
42.
go back to reference Heydari, M, et al. Comparison of various classification algorithms in the diagnosis of type 2 diabetes in Iran. Int J Diabetes Dev C. 2016;36(2):167–73.CrossRef Heydari, M, et al. Comparison of various classification algorithms in the diagnosis of type 2 diabetes in Iran. Int J Diabetes Dev C. 2016;36(2):167–73.CrossRef
Metadata
Title
Accurate and rapid screening model for potential diabetes mellitus
Authors
Dongmei Pei
Yang Gong
Hong Kang
Chengpu Zhang
Qiyong Guo
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2019
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-019-0790-3

Other articles of this Issue 1/2019

BMC Medical Informatics and Decision Making 1/2019 Go to the issue