
Open Access 01-12-2019 | Hypertension | Research article

On the interpretability of machine learning-based model for predicting hypertension

Authors: Radwa Elshawi, Mouaz H. Al-Mallah, Sherif Sakr

Published in: BMC Medical Informatics and Decision Making | Issue 1/2019


Abstract

Background

Although complex machine learning models commonly outperform traditional simple interpretable models, clinicians find it hard to understand and trust these complex models because their predictions lack intuition and explanation. The aim of this study is to demonstrate the utility of various model-agnostic explanation techniques for machine learning models through a case study analyzing the outcomes of a random forest model for predicting individuals at risk of developing hypertension based on cardiorespiratory fitness data.
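To make the setup concrete, the following is a minimal sketch (not the authors' code) of training the kind of random forest model the study explains. The file name, column names, and hyperparameters are illustrative assumptions, not the actual FIT project variables.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical file and column names; the real cohort variables differ.
df = pd.read_csv("fit_cohort.csv")
X = df[["age", "resting_sbp", "peak_mets", "resting_hr", "peak_hr"]]
y = df["incident_hypertension"]  # 1 if hypertension developed during the 10-year follow-up

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# A plain random forest classifier stands in for the study's trained model.
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)
print("held-out accuracy:", rf.score(X_test, y_test))
```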

Methods

The dataset used in this study contains information on 23,095 patients who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Five global interpretability techniques (Feature Importance, Partial Dependence Plot, Individual Conditional Expectation, Feature Interaction, Global Surrogate Models) and two local interpretability techniques (Local Surrogate Models, Shapley Value) were applied to show how interpretability techniques can help clinical staff better understand and trust the outcomes of machine learning-based predictions.
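As an illustration of two of the global techniques named above, the sketch below computes permutation feature importance and partial dependence / individual conditional expectation plots with scikit-learn's model-inspection utilities. It assumes the `rf` model and data splits from the previous sketch and is not the authors' implementation.

```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay, permutation_importance

# Permutation feature importance: mean drop in score when each feature is shuffled.
result = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(X_test.columns, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.4f}")

# Partial dependence (population-average effect) together with ICE curves
# (one curve per patient) for two illustrative features.
PartialDependenceDisplay.from_estimator(
    rf, X_test, features=["peak_mets", "age"], kind="both")
plt.show()
```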

Results

Several experiments were conducted and are reported. The results show that different interpretability techniques shed light on different aspects of the model's behavior: global interpretations enable clinicians to understand the entire conditional distribution modeled by the trained response function, whereas local interpretations promote understanding of small parts of the conditional distribution for specific instances.
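For contrast, a local explanation attributes a single patient's predicted risk to individual features. The sketch below uses Shapley values via the third-party `shap` package, assuming the `rf` model and `X_test` split from the earlier sketches; the exact output layout can differ across `shap` versions.

```python
import shap

explainer = shap.TreeExplainer(rf)   # rf from the earlier sketch
patient = X_test.iloc[[0]]           # a single held-out patient
shap_values = explainer.shap_values(patient)

# For a binary classifier, older shap versions return one array per class
# (index 1 = "develops hypertension"); newer versions return a single array
# with a trailing class dimension.
print("expected (baseline) value:", explainer.expected_value)
print("per-feature contributions:", shap_values)
```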

Conclusions

Interpretability techniques can vary in the explanations they provide for the behavior of a machine learning model. Global interpretability techniques have the advantage of generalizing over the entire population, while local interpretability techniques focus on giving explanations at the level of individual instances. Both approaches can be equally valid depending on the application need, and both are effective for assisting clinicians in the medical decision process; however, clinicians will always retain the final say on accepting or rejecting the outcome of a machine learning model and its explanations based on their domain expertise.
Metadata
Title
On the interpretability of machine learning-based model for predicting hypertension
Authors
Radwa Elshawi
Mouaz H. Al-Mallah
Sherif Sakr
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2019
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-019-0874-0
