Published in: BMC Medical Informatics and Decision Making 1/2020

Open Access 01-12-2020 | Acute Kidney Injury | Research article

Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model

Authors: Yuan Wang, Yake Wei, Hao Yang, Jingwei Li, Yubo Zhou, Qin Wu

Abstract

Background

Acute Kidney Injury (AKI) is a common complication among Intensive Care Unit (ICU) patients, marked by high cost, high morbidity and high mortality. Because early prediction of AKI is critical for patients’ outcomes and data mining is a powerful prediction tool, many AKI prediction models based on machine learning methods have been proposed. Our work is motivated by the fact that the incidence of AKI changes over time under the joint influence of patients’ daily drug combinations and their physiological indexes. However, most existing models have not considered this temporal correlation. In addition, sparse, high-dimensional and highly imbalanced clinical data make it hard to achieve ideal performance.

Methods

We develop a fast, simple and low-cost model based on an ensemble learning algorithm, named Ensemble Time Series Model (ETSM). Besides using vital signs and laboratory results as explicit indicators, ETSM explores the effect of drug combinations as possible implicit indicators for AKI prediction. The model transforms temporal medication information into a multidimensional vector in order to measure cumulative drug effects that may cause AKI.
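As a rough illustration of turning temporal medication information into a vector, the sketch below collapses per-day drug records into one cumulative-exposure value per drug. The abstract does not specify the encoding ETSM uses, so everything here is an assumption made for illustration: the function name `medication_vector`, the `(days_ago, drug)` record format, the placeholder drug names, and the exponential `decay` weighting are all hypothetical.

```python
def medication_vector(records, drug_index, decay=0.5):
    """Collapse (days_ago, drug) records into one cumulative-exposure vector.

    records    : iterable of (days_ago, drug_name); days_ago = 0 is the most
                 recent day before the prediction time (assumed format).
    drug_index : dict mapping drug name -> position in the output vector.
    decay      : hypothetical per-day weight; older doses contribute less.
    """
    vec = [0.0] * len(drug_index)
    for days_ago, drug in records:
        if drug in drug_index:  # drugs outside the vocabulary are ignored
            vec[drug_index[drug]] += decay ** days_ago
    return vec


# Placeholder drug names; a real vocabulary would come from the dataset.
drug_index = {"drug_a": 0, "drug_b": 1}
records = [(0, "drug_a"), (1, "drug_a"), (2, "drug_b"), (1, "drug_c")]
vector = medication_vector(records, drug_index)
```

A vector of this kind could be concatenated with vital-sign and laboratory features before training a tree ensemble. Whether ETSM weights older doses less, or simply counts them, is not stated in the abstract.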

Results

We compare ETSM with state-of-the-art models on the ICUC and MIMIC III datasets. In these experiments, our model obtains satisfactory performance (ICUC: AUC 0.81 at 24 hours ahead and 0.78 at 48 hours ahead; MIMIC III: AUC 0.95 at 24 hours ahead and 0.95 at 48 hours ahead). We also compare the effects of different sampling and feature generation methods on model performance. In the ablation study, we validate that medication information improves model performance (24 hours ahead: AUC increased from 0.74 to 0.81). We also find that the model’s performance is closely related to the degree of class balance in the derivation dataset, and we identify the optimal ratio of majority class size to minority class size for AKI prediction.
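To make the class-ratio and AUC discussion concrete, here is a minimal sketch of random undersampling to a chosen majority-to-minority ratio, together with a pairwise AUC computation. The abstract does not say which sampling methods were compared or what the optimal ratio is, so random undersampling, the function names `undersample` and `auc`, and the ratio of 3 used in the example are assumptions for illustration only.

```python
import random


def undersample(samples, labels, ratio, seed=0):
    """Keep every minority row (label 1) and at most ratio * n_minority
    randomly chosen majority rows (label 0)."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(majority), int(ratio * len(minority)))
    kept = sorted(minority + rng.sample(majority, n_keep))
    return [samples[i] for i in kept], [labels[i] for i in kept]


def auc(scores, labels):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 5 positive cases among 100 stays, rebalanced to a 3:1 ratio.
labels = [1] * 5 + [0] * 95
samples = list(range(100))
balanced_x, balanced_y = undersample(samples, labels, ratio=3)
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Sweeping `ratio` over several values and recording the validation AUC at each one is one simple way to look for a ratio that works best on a given dataset.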

Conclusions

ETSM is an effective method for the early prediction of AKI. The model confirms that AKI incidence is related to clinical medication. Compared with other prediction methods, ETSM provides comparable performance and better interpretability.
Metadata
Title
Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model
Authors
Yuan Wang
Yake Wei
Hao Yang
Jingwei Li
Yubo Zhou
Qin Wu
Publication date
01-12-2020
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2020
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-020-01245-4