Published in: BMC Medical Informatics and Decision Making 1/2019

Open Access 01-12-2019 | Research article

Importance of medical data preprocessing in predictive modeling and risk factor discovery for the frailty syndrome

Authors: Andreas Philipp Hassler, Ernestina Menasalvas, Francisco José García-García, Leocadio Rodríguez-Mañas, Andreas Holzinger

Abstract

Background

Increasing life expectancy results in more elderly people struggling with age-related diseases and functional conditions. This poses major challenges for establishing new approaches to maintaining health at a higher age. An important aspect of age-related deterioration of the general patient condition is frailty. The frailty syndrome is associated with a high risk of falls, hospitalization, disability, and ultimately increased mortality. Predictive data mining enables the discovery of potential risk factors and can serve as a clinical decision support system that provides the medical doctor with information on the probable clinical outcome of the patient. This enables the professional to react promptly and to avert likely adverse events in advance.

Methods

Medical data from 474 participants of the Toledo Study for Healthy Aging (TSHA) were used, comprising 284 health-related parameters, including questionnaire answers, blood parameters and vital parameters. Binary classification models were built to distinguish between frail and non-frail study subjects.
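The abstract does not state how missing values were handled before the classifiers were trained. As a minimal sketch of one common option, and purely as an assumption, the following replaces each missing entry with the median of the observed values in its column. The function name `impute_median` and the toy records are hypothetical and are not taken from the TSHA data.

```python
from statistics import median


def impute_median(rows):
    """Return a copy of rows in which every None is replaced by the
    median of the observed (non-None) values in the same column."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # A column with no observed values cannot be imputed; keep None.
        medians.append(median(observed) if observed else None)
    return [
        [medians[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


# Hypothetical toy records: age (years), grip strength (kg), walking speed (m/s).
records = [
    [78, 21.0, 0.9],
    [83, None, 0.6],
    [75, 30.0, None],
    [90, 15.0, 0.4],
]

completed = impute_median(records)
for row in completed:
    print(row)
```

Median imputation is only one of several strategies. Dropping sparse parameters or using model-based imputation are alternatives, and the choice can change which predictors appear important.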

Results

Using the available TSHA data and the discovered potential predictors, it was possible to design, develop and evaluate a variety of predictive models for the frailty syndrome. The best-performing model was the support vector machine (SVM, 78.31%). Moreover, a methodology was developed that makes it possible to explore and use incomplete medical data and, in addition, to identify potential predictors and enable interpretability.
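The abstract reports 78.31% for the SVM without naming the performance measure. Because the data are described as imbalanced, the choice of measure matters. The sketch below, built on hypothetical toy labels, contrasts plain accuracy with balanced accuracy (the mean of sensitivity and specificity) for a binary frail versus non-frail task. It illustrates the general issue and is not a reconstruction of the authors' evaluation.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; a class with no samples scores 0."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced sample: 2 frail (1) and 8 non-frail (0) subjects.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A trivial classifier that labels everyone as non-frail.
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced accuracy={bal_acc:.2f}")
```

In this toy case the trivial classifier reaches an accuracy of 0.80 while its balanced accuracy is 0.50, which is why a single percentage is hard to interpret on imbalanced data without knowing the measure behind it.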

Conclusions

This work demonstrates that it is feasible to use incomplete, imbalanced medical data to develop a predictive model for the frailty syndrome. Moreover, potential predictive factors were discovered, which were approved by clinicians. Future work will aim to improve prediction accuracy, especially with regard to separating the group of frail patients into frail and pre-frail ones, and to analyze the differences between them.
Metadata
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2019
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-019-0747-6
