Published in: BMC Medical Research Methodology 1/2018

Open Access 01-12-2018 | Research article

Machine learning methodologies versus cardiovascular risk scores, in predicting disease risk

Authors: Alexandros C. Dimopoulos, Mara Nikolaidou, Francisco Félix Caballero, Worrawat Engchuan, Albert Sanchez-Niubo, Holger Arndt, José Luis Ayuso-Mateos, Josep Maria Haro, Somnath Chatterji, Ekavi N. Georgousopoulou, Christos Pitsavos, Demosthenes B. Panagiotakos

Abstract

Background

The use of cardiovascular disease (CVD) risk estimation scores in primary prevention has long been established. However, their performance remains a matter of concern. The aim of this study was to explore the potential of machine learning (ML) methodologies for CVD prediction, particularly in comparison with an established risk tool, the HellenicSCORE.

Methods

Data from the ATTICA prospective study (n = 2020 adults enrolled during 2001–02 and followed up in 2011–12) were used. Three machine learning classifiers (k-NN, random forest, and decision tree) were trained and evaluated against 10-year CVD incidence, in comparison with the HellenicSCORE tool (a calibration of the ESC SCORE). Training datasets comprising from 5 to 16 variables were chosen, with or without bootstrapping, in an attempt to achieve the best overall performance for the machine learning classifiers.
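The abstract does not describe how bootstrapping was applied to the training datasets. As a minimal sketch of the general idea only, and not the authors' procedure, the following draws a bootstrap resample (same size as the original, sampled with replacement) from a small made-up training set. The function name, the row layout, and the `cvd_event` field are hypothetical and chosen purely for illustration.

```python
import random


def bootstrap_sample(rows, rng):
    """Return a bootstrap resample: len(rows) items drawn with replacement."""
    return [rng.choice(rows) for _ in rows]


# Hypothetical toy training set; these rows are NOT data from the ATTICA study.
training_rows = [{"id": i, "cvd_event": i % 5 == 0} for i in range(20)]

# A seeded generator keeps the resample reproducible between runs.
rng = random.Random(0)
resampled = bootstrap_sample(training_rows, rng)

# Sampling with replacement usually repeats some rows and omits others.
distinct_ids = {row["id"] for row in resampled}
print(f"{len(resampled)} rows drawn, {len(distinct_ids)} distinct")
```

A classifier would then be fitted on `resampled` instead of `training_rows`; repeating the draw with different seeds gives a set of training variants to compare.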

Results

Performance varied with the classifier and the training dataset, but was comparable between the two methodological approaches. In particular, the HellenicSCORE showed an accuracy of 85%, specificity of 20%, sensitivity of 97%, positive predictive value of 87%, and negative predictive value of 58%, whereas for the machine learning methodologies accuracy ranged from 65 to 84%, specificity from 46 to 56%, sensitivity from 67 to 89%, positive predictive value from 89 to 91%, and negative predictive value from 24 to 45%; random forest gave the best results and k-NN the poorest.
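The five measures reported above follow the standard definitions based on a two-by-two confusion matrix. The sketch below computes them from raw counts. The counts are hypothetical and are not taken from the study, and the abstract does not state which outcome class was treated as "positive", so that choice is left to whoever supplies the counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c):
    """Compute the five standard measures from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (not results from the paper).
counts = ConfusionCounts(tp=80, fp=10, tn=5, fn=5)
metrics = diagnostic_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With imbalanced classes, accuracy can stay high while specificity or negative predictive value is low, which is why all five measures are worth reading together.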

Conclusions

The alternative approach of machine learning classification produced results comparable to those of risk prediction scores; it can therefore be used as a method of CVD prediction, taking into consideration the advantages that machine learning methodologies may offer.
Metadata
Publication date
01-12-2018
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2018
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-018-0644-1
