Published in: International Journal of Diabetes in Developing Countries 4/2016

01-12-2016 | Original Article

Impact of selected pre-processing techniques on prediction of risk of early readmission for diabetic patients in India

Authors: Reena Duggal, Suren Shukla, Sarika Chandra, Balvinder Shukla, Sunil Kumar Khatri


Abstract

Diabetes is associated with an increased risk of hospital readmission. Predicting the risk of readmission for diabetic patients can help in implementing appropriate plans to prevent these readmissions. However, real-world medical data are noisy, inconsistent, and incomplete, so before building a prediction model it is essential to pre-process the data efficiently and make them suitable for predictive modelling. The objective of this study is to assess the impact of selected pre-processing techniques on the prediction of risk of 30-day readmission among patients with diabetes in India. De-identified electronic medical records from a reputed hospital in the National Capital Region of India were used, covering diabetes patients ≥18 years old discharged from hospital between 2012 and 2015 (n = 9381). This paper focuses on data pre-processing steps to improve readmission prediction outcomes. The impact of different pre-processing choices, including feature selection, missing value imputation, and data balancing, on the performance of logistic regression, Naïve Bayes, and decision tree classifiers was assessed using several performance metrics: area under curve (AUC), precision, recall, and accuracy. This comprehensive experimental study, the first of its kind from an Indian healthcare perspective, offers empirical evidence that most of the proposed models with pre-processing techniques significantly outperform the baseline methods (without any pre-processing) with respect to the selected evaluation criteria. AUC increased substantially with the use of an oversampling technique, because the data are skewed on the class label Readmission. Recall was the biggest gainer, with its range increasing from 0.02–0.23 to 0.78–0.85, and the AUC range also increased from 0.56–0.68 to 0.83–0.86 with the pre-processing approach. Data pre-processing has a significant effect on hospital readmission predictive accuracy for patients with diabetes, with certain schemes proving inferior to competing approaches. In addition, the impact of pre-processing schemes was found to vary by technique, suggesting that different best practices should be formulated to obtain better results from a specific technique.
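The abstract attributes most of the gain in recall and AUC to oversampling a skewed class label. A minimal sketch of that idea follows, using only the Python standard library. It is not the authors' implementation: the function names `random_oversample` and `binary_metrics` are hypothetical, plain random duplication is assumed as the oversampling scheme (the abstract does not say which variant was used), and the 90/10 toy class split is illustrative, not taken from the study data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (assumed scheme, see lead-in)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, and accuracy for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


if __name__ == "__main__":
    # Toy data: 90 non-readmitted (0) and 10 readmitted (1) records.
    rows = [[i] for i in range(100)]
    labels = [0] * 90 + [1] * 10

    # A trivial model that never predicts readmission looks accurate
    # on skewed data but has zero recall for the readmitted class.
    print(binary_metrics(labels, [0] * 100))

    # After oversampling, both classes are equally represented.
    _, balanced_labels = random_oversample(rows, labels)
    print(Counter(balanced_labels))
```

The toy example shows why accuracy alone is misleading on skewed data and why recall is the metric most sensitive to balancing. In practice, oversampling is normally applied to the training split only, so that duplicated records do not leak into the evaluation set.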
Metadata
Title
Impact of selected pre-processing techniques on prediction of risk of early readmission for diabetic patients in India
Authors
Reena Duggal
Suren Shukla
Sarika Chandra
Balvinder Shukla
Sunil Kumar Khatri
Publication date
01-12-2016
Publisher
Springer India
Published in
International Journal of Diabetes in Developing Countries / Issue 4/2016
Print ISSN: 0973-3930
Electronic ISSN: 1998-3832
DOI
https://doi.org/10.1007/s13410-016-0495-4
