Published in: BMC Medical Informatics and Decision Making 1/2019

Open Access 01-12-2019 | Care | Research article

Predicting life expectancy with a long short-term memory recurrent neural network using electronic medical records

Authors: Merijn Beeksma, Suzan Verberne, Antal van den Bosch, Enny Das, Iris Hendrickx, Stef Groenewoud


Abstract

Background

Life expectancy is one of the most important factors in end-of-life decision making. Good prognostication, for example, helps to determine the course of treatment and to anticipate the procurement of health care services and facilities; more broadly, it facilitates Advance Care Planning. Advance Care Planning improves the quality of the final phase of life by stimulating doctors to explore preferences for end-of-life care with their patients and with people close to the patients. Physicians, however, tend to overestimate life expectancy and so miss the window of opportunity to initiate Advance Care Planning. This research tests the potential of machine learning and natural language processing techniques for predicting life expectancy from electronic medical records.

Methods

We approached the prediction of life expectancy as a supervised machine learning task. We trained and tested a long short-term memory recurrent neural network on the medical records of deceased patients. We developed the model with a ten-fold cross-validation procedure and evaluated its performance on a held-out set of test data. We compared a model that does not use text features (baseline model) with a model that uses features extracted from the free text of the medical records (keyword model), and with doctors’ performance on a similar task as described in the scientific literature.
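As a rough illustration of the evaluation setup described above, the sketch below sets aside a held-out test set and divides the remaining records into ten cross-validation folds. It is not the authors' code: the function name, the 10% test fraction, and the index-based splitting are assumptions made for this example, and the abstract does not state how the split was actually performed.

```python
import random


def split_holdout_and_folds(n_records, test_fraction=0.1, n_folds=10, seed=0):
    """Return (test_indices, [(train_indices, validation_indices), ...]).

    Hypothetical helper: the test fraction and seed are illustrative
    assumptions, not values taken from the study.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)

    # Reserve the held-out test set first; it is never used during development.
    n_test = int(n_records * test_fraction)
    test_indices = indices[:n_test]
    development = indices[n_test:]

    # Deal the development records into n_folds roughly equal folds.
    folds = [development[k::n_folds] for k in range(n_folds)]

    # Each fold serves once as validation data; the other folds form the training data.
    splits = []
    for k in range(n_folds):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((training, validation))
    return test_indices, splits


test_indices, splits = split_holdout_and_folds(100)
print(len(test_indices), len(splits))
```

With medical records, the indices would typically refer to patients rather than to individual consultations, so that all records of one patient fall on the same side of a split.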

Results

Counting a prognosis as correct when it falls within a margin of 33% around the actual life expectancy, both doctors and the baseline model were correct in 20% of the cases. The keyword model, in comparison, attained an accuracy of 29%. Doctors overestimated life expectancy in 63% of their incorrect prognoses, which hampers anticipation of appropriate end-of-life care; the keyword model overestimated life expectancy in only 31% of its incorrect prognoses.
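The two measures reported above can be made concrete with a small sketch. It assumes that a prognosis counts as correct when it deviates from the actual life expectancy by at most 33% of the actual value, and that the overestimation rate is taken over the incorrect prognoses only. The function name and the example numbers are hypothetical and are not data from the study.

```python
def score_prognoses(actual, predicted, margin=0.33):
    """Score prognoses against actual life expectancies (same time unit).

    Assumption for this sketch: a prognosis is correct when
    |predicted - actual| <= margin * actual.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    correct = overestimated = underestimated = 0
    for a, p in zip(actual, predicted):
        if abs(p - a) <= margin * a:
            correct += 1
        elif p > a:
            overestimated += 1
        else:
            underestimated += 1

    total = len(actual)
    incorrect = overestimated + underestimated
    return {
        # Share of all prognoses that fall within the margin.
        "accuracy": correct / total if total else 0.0,
        # Share of the incorrect prognoses that were too optimistic.
        "overestimated_share": overestimated / incorrect if incorrect else 0.0,
    }


# Made-up life expectancies in months, for illustration only.
result = score_prognoses(actual=[10, 20, 30, 40], predicted=[12, 30, 15, 41])
print(result)
```

Reporting the overestimated share separately from accuracy matters here because the two kinds of error are not equally harmful: an overestimate delays the start of Advance Care Planning, whereas an underestimate does not.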

Conclusions

Prognostication of life expectancy is difficult for humans. Our research shows that machine learning and natural language processing techniques offer a feasible and promising approach to predicting life expectancy. The research has potential for real-life applications, such as supporting timely recognition of the right moment to start Advance Care Planning.
Footnotes
1
Due to the skewed distribution of the data (7% prevalence), the authors prefer to discuss their results in terms of precision and recall rather than sensitivity and specificity, because these measures provide more information about the algorithm’s performance ([34]:5).
 
Literature
1.
Brinkman-Stoppelenburg A, van der Heide A. The effects of advance care planning on end-of-life care: a systematic review. Palliat Med. 2014;28:1000–25.
2.
Billings JA, Bernacki R. Strategic targeting of advance care planning interventions - the goldilocks phenomenon. JAMA Intern Med. 2014;174:620–4.
3.
Weeks JC, Cook F, O’Day S, Peterson LM, Wenger N, Reding D, et al. Relationship between Cancer patients’ predictions of prognosis and their treatment preferences. J Am Med Assoc. 1998;279:1709–14.
4.
Frankl D, Oye RK, Bellamy PE. Attitudes of hospitalized patients toward life support: a survey of 200 medical inpatients. Am J Med. 1989;86:645–8.
5.
Celi LA, Marshall JD, Lai Y, Stone DJ. Disrupting electronic health records systems: The next generation. JMIR Med Inform. 2015;3(4):e34.
7.
Marlin BM, Kale DC, Khemani RG, Wetzel RC. Unsupervised pattern discovery in electronic health care data using probabilistic clustering models. Proc 2nd ACM SIGHIT Int Heal Informatics Symp. 2012;28:389–98.
8.
Cios KJ, Moore WG. Uniqueness of medical data mining. Artif Intell Med. 2002;26:1–24.
9.
Thoonsen B, Engels Y, Van Rijswijk E, Verhagen S, Van Weel C, Groot M, et al. Early identification of palliative care patients in general practice: development of RADboud indicators for PAlliative care needs. Br J Gen Pract. 2012;62:625–31.
10.
Highet G, Crawford D, Murray SA, Boyd K. Development and evaluation of the Supportive and Palliative Care Indicators Tool (SPICT): a mixed-methods study. BMJ Support Palliat Care. 2014;4(3):285–90.
11.
Moss AH, Ganjoo J, Sharma S, Gansor J, Senft S, Weaner B, et al. Utility of the “surprise” question to identify Dialysis patients with high mortality. Clin J Am Soc Nephrol. 2008;3:1379–84.
12.
Moss AH, Lunney JR, Culb S, Auber M, Kurian S, Rogers J, et al. Prognostic significance of the “surprise” question in Cancer patients. J Palliat Med. 2010;13:837–40.
13.
Maas EAT, Murray SA, Engels Y, Campbell C. What tools are available to identify patients with palliative care needs in primary care: a systematic literature review and survey of European practice. BMJ Support Palliat Care. 2013;3:444–51.
14.
Claessen SJJ, Francke AL, Engels Y, Deliens L. How do GPs identify a need for palliative care in their patients? An interview study. BMC Fam Pract. 2013;14.
15.
Christakis NA, Lamont EB. Extent and determinants of error in doctors’ prognoses in terminally ill patients: prospective cohort study. BMJ. 2000;320:469–73.
16.
White N, Reid F, Harris A, Harries P, Stone P. A systematic review of predictions of survival in palliative care: how accurate are clinicians and who are the experts? PLoS One. 2016;11:1–20.
17.
Ministerie van Volksgezondheid, Welzijn en sport (Dutch ministry of public health). Informatiekaart Palliatief Terminale Zorg (information card palliative terminal care). 2015.
18.
Walczak S. Artificial neural network medical decision support tool: predicting transfusion requirements of ER patients. IEEE Trans Inf Technol Biomed. 2005;9:468–74.
19.
Mazurowski MA, Habas PA, Zurada JM, Lo JY, Baker JA, Tourassi GD. Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance. Neural Netw. 2008;21:427–36.
21.
Khemphila A, Boonjing V. Heart disease classification using neural network and feature selection. IEEE 21st Int Conf Syst Eng. 2011:406–9.
22.
Al-Shayea QK. Artificial neural networks in medical diagnosis. Int J Comput Sci Issues. 2011;8:150–4.
23.
Hazan H, Hilu D, Manevitz L, Ramig LO, Sapir S. Early diagnosis of Parkinson’s disease via machine learning on speech data. IEEE 27th Conv Electr Electron Eng Isr. 2012;2012.
24.
Lipton ZC, Kale DC, Elkan C, Wetzel R. Learning to diagnose with LSTM recurrent neural networks. Int Conf Learn Represent. 2016:1–18.
25.
Khan J, Wei JS, Ringnér M, Saal LH, Ladanyi M, Westermann F, et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med. 2001;7:673–9.
26.
Kordylewski H, Graupe D, Liu K. A novel large-memory neural network as an aid in medical diagnosis applications. IEEE Trans Inf Technol Biomed. 2001;5:202–9.
27.
Thangarasu G, Dominic PDD. Prediction of hidden knowledge from clinical database using data mining techniques. IEEE Int Conf Comput Inf Sci. 2014.
28.
Liu C, Sun H, Du N, Tan S, Fei H, Fan W, et al. Augmented LSTM Framework to Construct Medical Self-diagnosis Android. IEEE 16th Int Conf Data Min. 2016:251–60.
30.
Ramesh BP, Belknap SM, Li Z, Frid N, West DP, Yu H. Automatically recognizing medication and adverse event information from Food and Drug Administration’s adverse event reporting system narratives. JMIR Med Informatics. 2014;2. https://doi.org/10.2196/medinform.3022.
31.
Iyer SV, Harpaz R, Lependu P, Bauer-Mehren A, Shah NH. Mining clinical text for signals of adverse drug-drug interactions. J Am Med Informatics Assoc. 2014;21:353–62.
34.
Avati A, Jung K, Harman S, Downing L, Ng A, Shah NH. Improving palliative care with deep learning. IEEE Int Conf Bioinforma Biomed. 2017;18(4).
36.
Dietterich TG. Machine learning for sequential data: a review. Proc Jt IAPR Int Work Struct Syntactic Stat Pattern Recogn. 2002;2396:15–30.
37.
Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9:1735–80.
38.
Kim H-G, Jang G-J, Choi H-J, Kim M, Kim Y-W, Choi J. Medical examination data prediction using simple recurrent network and long short-term memory. Proc Sixth Int Conf Emerg Databases Technol Appl Theory. 2016:26–34.
40.
Jagannatha AN, Yu H. Bidirectional RNN for Medical Event Detection in Electronic Health Records. Proc 2016 Conf North Am chapter Assoc Comput Linguist Hum Lang Technol. 2016;2016:473–82.
41.
Sadikin M, Fanany MI, Basaruddin T. A new data representation based on training data characteristics to extract drug name entity in medical text. Comput Intell Neurosci. 2016;2016.
42.
Sahu SK, Anand A. Drug-drug interaction extraction from biomedical text using long short term memory. Network. 2017;86.
46.
World Health Organization. ICD-10: international statistical classification of diseases and related health problems: tenth revision. 2004.
47.
WONCA International Classification Committee. International classification of primary care (ICPC). 1987.
48.
Beeksma MT. Computer, how long have I got left? Predicting life expectancy with a long short-term memory to aid in early identification of the palliative phase. Nijmegen; 2017.
51.
Kullback S, Leibler RA. On information and sufficiency. Ann Math Stat. 1951;22:79–86.
52.
Kenter T, Borisov A, de Rijke M. Siamese CBOW: Optimizing Word Embeddings for Sentence Representations. Proc 54th Annu Meet Assoc Comput Linguist. 2016:941–51.
Metadata
Title
Predicting life expectancy with a long short-term memory recurrent neural network using electronic medical records
Authors
Merijn Beeksma
Suzan Verberne
Antal van den Bosch
Enny Das
Iris Hendrickx
Stef Groenewoud
Publication date
01-12-2019
Publisher
BioMed Central
Keyword
Care
Published in
BMC Medical Informatics and Decision Making / Issue 1/2019
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-019-0775-2
