Published in: International Journal of Diabetes in Developing Countries 2/2016

01-06-2016 | Original Article

Comparison of various classification algorithms in the diagnosis of type 2 diabetes in Iran

Authors: Mahmoud Heydari, Mehdi Teimouri, Zainabolhoda Heshmati, Seyed Mohammad Alavinia

Abstract

In today’s medical world, data on the symptoms of patients with various diseases are so extensive that a single person (a doctor) cannot analyze and weigh all the factors. An intelligent system that considers these factors and identifies a suitable model relating the different parameters is therefore needed. Data mining, as the foundation of such systems, has played a vital role in the advancement of medical sciences, especially in the diagnosis of diseases. Type 2 diabetes is one such disease: it has become more common in recent years, and late diagnosis can lead to serious complications. In this paper, several data mining methods and algorithms are applied to a set of screening data for type 2 diabetes from Tabriz, Iran. The performance of support vector machine, artificial neural network, decision tree, nearest neighbors, and Bayesian network classifiers is compared in an effort to find the best algorithm for diagnosing this disease. The artificial neural network, with an accuracy of 97.44 %, performs best on the chosen dataset. The accuracies of the support vector machine, decision tree, 5-nearest neighbor, and Bayesian network are 81.19, 95.03, 90.85, and 91.60 %, respectively. The simulation results show that the effectiveness of a classification technique depends on the application, as well as on the nature and complexity of the dataset used. Moreover, no single classification technique can be said to always perform best. Therefore, where data mining is used for the diagnosis or prediction of diseases, consultation with specialists is indispensable for selecting the number and type of dataset parameters to obtain the best possible results.
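As a rough illustration of the accuracy measure on which the classifiers above are compared, the following Python sketch ranks the accuracies reported in the abstract and evaluates a plain 5-nearest-neighbor classifier on a handful of invented points. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `accuracy` and `knn_predict`, the two features, and all feature values are hypothetical, and none of the toy data comes from the Tabriz screening dataset.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Accuracies (in percent) as reported in the abstract.
reported = {
    "artificial neural network": 97.44,
    "support vector machine": 81.19,
    "decision tree": 95.03,
    "5-nearest neighbor": 90.85,
    "Bayesian network": 91.60,
}
ranking = sorted(reported, key=reported.get, reverse=True)

# Invented toy data (assumption): two made-up features per person,
# label 0 = non-diabetic, label 1 = diabetic. Not data from the study.
train_X = [(5.0, 22.0), (5.2, 24.0), (4.8, 21.0),
           (5.5, 23.5), (5.1, 25.0), (4.9, 22.5),
           (8.5, 30.0), (9.0, 32.0), (9.5, 31.0),
           (8.8, 29.5), (10.0, 33.0), (9.2, 30.5)]
train_y = [0] * 6 + [1] * 6

# Held-out toy points; the last one is a deliberately borderline case.
test_X = [(5.0, 23.0), (9.0, 31.0), (5.3, 22.0), (7.5, 28.5)]
test_y = [0, 1, 0, 0]

preds = [knn_predict(train_X, train_y, x, k=5) for x in test_X]
knn_accuracy = accuracy(test_y, preds)

if __name__ == "__main__":
    for name in ranking:
        print(f"{name}: {reported[name]:.2f} %")
    print(f"toy 5-NN accuracy: {knn_accuracy:.2%}")
```

Accuracy alone can mislead on imbalanced screening data, where most people screened are healthy, so a comparison of this kind would usually also report sensitivity and specificity. The abstract quotes only accuracy, so the sketch does the same.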
Metadata
Title: Comparison of various classification algorithms in the diagnosis of type 2 diabetes in Iran
Authors: Mahmoud Heydari, Mehdi Teimouri, Zainabolhoda Heshmati, Seyed Mohammad Alavinia
Publication date: 01-06-2016
Publisher: Springer India
Published in: International Journal of Diabetes in Developing Countries, Issue 2/2016
Print ISSN: 0973-3930
Electronic ISSN: 1998-3832
DOI: https://doi.org/10.1007/s13410-015-0374-4
