Published in: BMC Medical Informatics and Decision Making 1/2010

Open Access 01-12-2010 | Research article

Regression tree construction by bootstrap: Model search for DRG-systems applied to Austrian health-data

Authors: Thomas Grubinger, Conrad Kobel, Karl-Peter Pfeiffer

Abstract

Background

DRG-systems are used to allocate resources fairly to hospitals based on their performance. Statistically, this allocation is based on simple rules that can be modeled with regression trees. However, the resulting models often have to be adjusted manually to be medically reasonable and ethical.

Methods

Rather than relying on manual adaptations of the original model, which degrade its performance, alternative trees are searched for systematically. The bootstrap-based method bumping is used to build diverse and accurate regression tree models for DRG-systems. A two-step model selection approach is proposed. First, a reasonable model complexity is chosen, based on statistical, medical and economic considerations. Second, a medically meaningful and accurate model is selected. An analysis of 8 data-sets from Austrian DRG-data is conducted and evaluated on whether diverse and accurate models can be produced for predefined tree complexities.
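The bootstrap-based search described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified greedy, CART-style regression tree with a fixed depth as a stand-in for a predefined tree complexity, and the names `fit_tree`, `predict`, `mse` and `bump` are hypothetical. Following the general idea of bumping, one tree is fitted per bootstrap sample, the tree fitted to the original data is kept as a candidate, and all candidates are ranked by their error on the original training data.

```python
import random

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def fit_tree(X, y, depth):
    """Greedy regression tree of at most `depth` levels (simplified, no pruning).
    Returns a leaf mean (float) or a tuple (feature, threshold, left, right)."""
    mean = sum(y) / len(y)
    if depth == 0 or len(set(y)) == 1:
        return mean
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: rows with value < t go left, the rest go right.
        for t in sorted({row[j] for row in X})[1:]:
            left = [i for i, row in enumerate(X) if row[j] < t]
            right = [i for i, row in enumerate(X) if row[j] >= t]
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # no feature has more than one distinct value
        return mean
    _, j, t, left, right = best
    return (j, t,
            fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1),
            fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        j, t, lo, hi = tree
        tree = lo if row[j] < t else hi
    return tree

def mse(tree, X, y):
    """Mean squared error of `tree` on the data (X, y)."""
    return sum((yi - predict(tree, r)) ** 2 for r, yi in zip(X, y)) / len(y)

def bump(X, y, depth=2, n_boot=25, seed=0):
    """Fit one tree per bootstrap sample plus one on the original data, and
    return (training MSE, tree) pairs sorted from most to least accurate."""
    rng = random.Random(seed)
    n = len(y)
    trees = [fit_tree(X, y, depth)]  # the original sample is always a candidate
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], depth))
    # Every candidate is scored on the original data, not on its own sample.
    return sorted(((mse(t, X, y), t) for t in trees), key=lambda p: p[0])
```

Calling `bump(X, y, depth=2)` on a list of feature rows `X` and a list of targets `y` returns the ranked candidates. In the spirit of the two-step selection, the depth would be fixed first, and a medically meaningful tree would then be picked from the top of the ranking rather than taking the first entry automatically.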

Results

The best bootstrap-based trees offer increased predictive accuracy compared to the trees built by the CART algorithm. The analysis demonstrates that even for very small tree sizes, diverse models can be constructed that are as accurate as, or more accurate than, the single model built by the standard CART algorithm.

Conclusions

Bumping is a powerful tool for constructing diverse and accurate regression trees that can serve as candidate models for DRG-systems. Furthermore, bumping and the proposed model selection approach are also applicable to other medical decision and prognosis tasks.
Metadata
Title
Regression tree construction by bootstrap: Model search for DRG-systems applied to Austrian health-data
Authors
Thomas Grubinger
Conrad Kobel
Karl-Peter Pfeiffer
Publication date
01-12-2010
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2010
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/1472-6947-10-9
