Published in: BMC Medical Informatics and Decision Making 1/2020

Open Access 01-12-2020 | Polyneuropathy | Research article

Identification of risk factors for patients with diabetes: diabetic polyneuropathy case study

Authors: Oleg Metsker, Kirill Magoev, Alexey Yakovlev, Stanislav Yanishevskiy, Georgy Kopanitsa, Sergey Kovalchuk, Valeria V. Krzhizhanovskaya

Abstract

Background

Data mining and analytics methods can be applied efficiently in medicine to develop models that use patient-specific data to predict the development of diabetic polyneuropathy. However, the accuracy of these predictive models can still be improved. Existing studies of diabetic polyneuropathy each considered only a limited number of predictors, which makes it difficult to compare the efficiency of different machine learning methods with different predictors and to find the most efficient one. The purpose of this study is to apply machine learning methods to identify the risk of diabetic polyneuropathy from structured electronic medical records collected in the databases of medical information systems.

Methods

For this study, we developed a structured procedure for predictive modelling, which includes data extraction and preprocessing, model adjustment and performance assessment, selection of the best models, and interpretation of results. The dataset contained a total of 238,590 laboratory records. Each record included 27 laboratory tests, age, gender, and the presence of retinopathy or nephropathy. The records covered 5846 patients with diabetes. The diagnosis served as the source of the target class values for classification.
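The preprocessing step described above can be illustrated with a minimal sketch. It assumes hypothetical field names (`glucose`, `neutrophils`, `diagnosis`, and so on) and median imputation of missing laboratory values; neither is specified in the abstract. Only two stand-ins for the 27 laboratory tests are shown, and the sample records are invented for illustration.

```python
from statistics import median

# Hypothetical field names; the abstract does not list the actual ones.
LAB_TESTS = ["glucose", "neutrophils"]  # stand-ins for the 27 laboratory tests
FLAGS = ["retinopathy", "nephropathy"]


def build_dataset(records):
    """Turn raw record dicts into (feature_rows, labels)."""
    # Median of each lab test over the records where it is present
    # (median imputation is an assumption, not the authors' stated method).
    medians = {
        test: median(r[test] for r in records if r.get(test) is not None)
        for test in LAB_TESTS
    }
    rows, labels = [], []
    for r in records:
        labs = [r[t] if r.get(t) is not None else medians[t] for t in LAB_TESTS]
        gender = 1.0 if r["gender"] == "F" else 0.0
        flags = [1.0 if r.get(f) else 0.0 for f in FLAGS]
        rows.append(labs + [float(r["age"]), gender] + flags)
        # The target class is derived from the recorded diagnosis.
        labels.append(1 if r["diagnosis"] == "polyneuropathy" else 0)
    return rows, labels


# Invented sample records, for illustration only.
records = [
    {"glucose": 9.0, "neutrophils": 6.0, "age": 61, "gender": "F",
     "retinopathy": True, "nephropathy": False, "diagnosis": "polyneuropathy"},
    {"glucose": None, "neutrophils": 4.0, "age": 48, "gender": "M",
     "retinopathy": False, "nephropathy": False, "diagnosis": "none"},
    {"glucose": 7.0, "neutrophils": 5.0, "age": 55, "gender": "M",
     "retinopathy": False, "nephropathy": True, "diagnosis": "none"},
]
X, y = build_dataset(records)
```

Each row of `X` holds the laboratory values followed by age, gender, and the two comorbidity flags, so a classifier can be trained with or without the last two columns to gauge their contribution.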

Results

Including two features, “nephropathy” and “retinopathy”, improved performance, achieving up to 79.82% precision, 81.52% recall, 80.64% F1 score, 82.61% accuracy, and 89.88% AUC with the neural network classifier. In addition, different models supported different interpretations: random forest confirmed that the most important risk factor for polyneuropathy is an increased neutrophil level, which signals the presence of inflammation in the body. Linear models showed a linear dependence of the presence of polyneuropathy on blood glucose levels, which is consistent with the clinical importance of blood glucose control.
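For readers unfamiliar with the reported metrics, the following minimal sketch shows how precision, recall, F1 score, accuracy, and AUC are conventionally defined for a binary classifier. The function names and all input values are hypothetical and are not taken from the study; the abstract does not state how the authors computed or averaged their figures.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical inputs for illustration only.
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=80)
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC definition is quadratic in the number of samples; it is used here because it makes the meaning of the metric explicit, not because it is efficient.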

Conclusion

The choice of model depends on whether one needs to identify pathophysiological mechanisms for a prospective study or to identify early or late predictors. Compared with previous studies, our research provides a comprehensive comparison of different decisions using a large, well-structured dataset applied to different decision support tasks.
Metadata
Title
Identification of risk factors for patients with diabetes: diabetic polyneuropathy case study
Authors
Oleg Metsker
Kirill Magoev
Alexey Yakovlev
Stanislav Yanishevskiy
Georgy Kopanitsa
Sergey Kovalchuk
Valeria V. Krzhizhanovskaya
Publication date
01-12-2020
Publisher
BioMed Central
Keyword
Polyneuropathy
Published in
BMC Medical Informatics and Decision Making / Issue 1/2020
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-020-01215-w