
Open Access 01-12-2019 | Hypertension | Research article

On the interpretability of machine learning-based model for predicting hypertension

Authors: Radwa Elshawi, Mouaz H. Al-Mallah, Sherif Sakr

Published in: BMC Medical Informatics and Decision Making | Issue 1/2019


Abstract

Background

Although complex machine learning models commonly outperform traditional simple interpretable models, clinicians find it hard to understand and trust these complex models because their predictions lack intuition and explanation. The aim of this study is to demonstrate the utility of various model-agnostic explanation techniques for machine learning models through a case study analyzing the outcomes of a random forest model for predicting individuals at risk of developing hypertension based on cardiorespiratory fitness data.
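To make the setup concrete, the following is a minimal sketch (not the authors' code) of training the kind of random forest model the study explains. The file name, column names, and hyperparameters are illustrative assumptions, not the actual FIT project variables.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical file and column names; the real cohort variables differ.
df = pd.read_csv("fit_cohort.csv")
X = df[["age", "resting_sbp", "peak_mets", "resting_hr", "peak_hr"]]
y = df["incident_hypertension"]  # 1 if hypertension developed during the 10-year follow-up

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# A plain random forest classifier stands in for the study's trained model.
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)
print("held-out accuracy:", rf.score(X_test, y_test))
```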

Methods

The dataset used in this study contains information on 23,095 patients who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Five global interpretability techniques (Feature Importance, Partial Dependence Plot, Individual Conditional Expectation, Feature Interaction, Global Surrogate Models) and two local interpretability techniques (Local Surrogate Models, Shapley Value) were applied to show how interpretability techniques can help clinical staff better understand and trust the outcomes of machine learning-based predictions.
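As an illustration of two of the global techniques named above, the sketch below computes permutation feature importance and partial dependence / individual conditional expectation plots with scikit-learn's model-inspection utilities. It assumes the `rf` model and data splits from the previous sketch and is not the authors' implementation.

```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay, permutation_importance

# Permutation feature importance: mean drop in score when each feature is shuffled.
result = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(X_test.columns, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.4f}")

# Partial dependence (population-average effect) together with ICE curves
# (one curve per patient) for two illustrative features.
PartialDependenceDisplay.from_estimator(
    rf, X_test, features=["peak_mets", "age"], kind="both")
plt.show()
```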

Results

Several experiments were conducted and are reported. The results show that different interpretability techniques shed light on different aspects of the model's behavior: global interpretations enable clinicians to understand the entire conditional distribution modeled by the trained response function, whereas local interpretations promote understanding of small parts of the conditional distribution for specific instances.
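For contrast, a local explanation attributes a single patient's predicted risk to individual features. The sketch below uses Shapley values via the third-party `shap` package, assuming the `rf` model and `X_test` split from the earlier sketches; the exact output layout can differ across `shap` versions.

```python
import shap

explainer = shap.TreeExplainer(rf)   # rf from the earlier sketch
patient = X_test.iloc[[0]]           # a single held-out patient
shap_values = explainer.shap_values(patient)

# For a binary classifier, older shap versions return one array per class
# (index 1 = "develops hypertension"); newer versions return a single array
# with a trailing class dimension.
print("expected (baseline) value:", explainer.expected_value)
print("per-feature contributions:", shap_values)
```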

Conclusions

Interpretability techniques can vary in the explanations they provide for the behavior of a machine learning model. Global interpretability techniques have the advantage of generalizing over the entire population, while local interpretability techniques focus on giving explanations at the level of individual instances. Both approaches can be equally valid depending on the application need, and both are effective for assisting clinicians in the medical decision process; however, clinicians will always retain the final say on accepting or rejecting the outcome of a machine learning model and its explanations based on their domain expertise.
Metadata
Title
On the interpretability of machine learning-based model for predicting hypertension
Authors
Radwa Elshawi
Mouaz H. Al-Mallah
Sherif Sakr
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2019
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-019-0874-0
