Published in: BMC Medical Informatics and Decision Making 1/2018

Open Access 01-12-2018 | Technical advance

Towards stroke prediction using electronic health records

Author: Douglas Teoh

Abstract

Background

As of 2014, stroke was the fourth leading cause of death in Japan. Predicting a future diagnosis of stroke would allow proactive healthcare measures to be taken. We aim to predict a diagnosis of stroke within one year of the patient’s last set of exam results or medical diagnoses.

Methods

Around 8000 electronic health records were provided by Tsuyama Jifukai Tsuyama Chuo Hospital in Japan. These records contained non-homogeneous temporal data, which were first transformed into a form usable by an algorithm. The transformed data were used as input to several neural network architectures designed to evaluate the efficacy of the supplied data and the networks’ ability to exploit relationships that could underlie the data. Because stroke cases were relatively rare, the class outputs were imbalanced, which biased the trained neural network models towards negative predictions. To address this issue, we designed regularization terms and incorporated them into the standard cross-entropy loss function. These terms penalized false positive and false negative predictions. We evaluated the performance of our trained models using Receiver Operating Characteristic (ROC) analysis.
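The abstract does not give the exact form of the regularization terms, so the following is only a minimal sketch of one way such a loss could look, under stated assumptions: binary labels, predicted probabilities, and "soft" false positive and false negative counts added to the mean cross-entropy. The function name `penalized_cross_entropy` and the weights `lambda_fp` and `lambda_fn` are hypothetical and are not taken from the paper.

```python
import math


def penalized_cross_entropy(y_true, y_prob, lambda_fp=1.0, lambda_fn=1.0, eps=1e-7):
    """Mean binary cross-entropy plus soft false-positive / false-negative penalties.

    Assumed, illustrative form only; the authors' actual terms may differ.
    y_true: iterable of 0/1 labels (1 = stroke diagnosis within the window).
    y_prob: iterable of predicted probabilities for the positive class.
    """
    n = len(y_true)
    cross_entropy = 0.0
    soft_fp = 0.0  # probability assigned to "stroke" on negative cases
    soft_fn = 0.0  # probability withheld from "stroke" on positive cases
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        cross_entropy -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        soft_fp += (1 - y) * p
        soft_fn += y * (1.0 - p)
    return (cross_entropy + lambda_fp * soft_fp + lambda_fn * soft_fn) / n
```

With both weights set to zero the function reduces to the standard cross-entropy loss. Raising `lambda_fn` relative to `lambda_fp` would, in this sketch, push a model away from the all-negative predictions that an imbalanced dataset encourages.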

Results

The best neural network combined the different sources of temporal data through a dual-input topology. This network attained an area under the Receiver Operating Characteristic curve of 0.669. Compared with the standard cross-entropy loss function, the custom regularization terms had a positive effect on the training process.
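For readers unfamiliar with the metric, the area under the ROC curve can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way in plain Python; the function name `roc_auc` is hypothetical, and this is not the evaluation code used in the study.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half. Quadratic in the number of
    cases, which is fine for an illustration.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a score of 0.669 indicates ranking that is better than chance but far from perfect.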

Conclusions

The techniques we describe in this paper are viable and the developed models form part of the foundation of a national clinical decision support system.
Metadata
Publisher
BioMed Central
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-018-0702-y
