Published in: BMC Medical Research Methodology 1/2010

Open Access 01-12-2010 | Research article

The search for stable prognostic models in multiple imputed data sets

Authors: David Vergouw, Martijn W Heymans, George M Peat, Ton Kuijpers, Peter R Croft, Henrica CW de Vet, Henriëtte E van der Horst, Daniëlle AWM van der Windt


Abstract

Background

In prognostic studies, model instability and missing data can both undermine results. Methods proposed for handling these problems are bootstrapping (B) and multiple imputation (MI). The authors examined the influence of these methods on model composition.

Methods

Models were constructed using a cohort of 587 patients who consulted in general practice in the Netherlands with a shoulder problem between January 2001 and January 2003 (the Dutch Shoulder Study). Outcome measures were persistent shoulder disability and persistent shoulder pain. Potential predictors included socio-demographic variables, characteristics of the pain problem, physical activity and psychosocial factors. Model composition and performance (calibration and discrimination) were assessed for models built using complete case analysis, MI, bootstrapping, or both MI and bootstrapping.
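
As an illustration of how bootstrapping can expose model instability, the following minimal Python sketch counts how often each candidate predictor is selected across bootstrap samples. It is not the authors' implementation: the function names (`inclusion_frequencies`, `toy_select`, `make_toy_data`), the simulated data, and the mean-difference selection rule are all hypothetical stand-ins for whatever variable-selection procedure a study applies. Passing several imputed data sets in `datasets` pools the counts across them, which is one assumed way of combining MI with bootstrapping.

```python
import random
from collections import Counter


def inclusion_frequencies(datasets, select, n_boot=200, seed=0):
    """Fraction of (data set, bootstrap sample) pairs in which each predictor is selected.

    datasets: list of data sets (e.g. imputed copies); each is a list of row dicts.
    select:   hypothetical callable mapping a list of rows to a set of predictor names.
    Predictors that are never selected are absent from the returned dict.
    """
    rng = random.Random(seed)
    counts = Counter()
    total = 0
    for rows in datasets:
        n = len(rows)
        for _ in range(n_boot):
            # Resample rows with replacement to form one bootstrap sample.
            sample = [rows[rng.randrange(n)] for _ in range(n)]
            counts.update(select(sample))
            total += 1
    return {name: c / total for name, c in counts.items()}


def toy_select(rows, outcome="y", threshold=0.5):
    """Stand-in selector (an assumption, not a regression-based procedure):
    keep predictors whose mean differs between outcome groups by more than threshold."""
    pos = [r for r in rows if r[outcome] == 1]
    neg = [r for r in rows if r[outcome] == 0]
    if not pos or not neg:
        return set()
    chosen = set()
    for name in rows[0]:
        if name == outcome:
            continue
        diff = sum(r[name] for r in pos) / len(pos) - sum(r[name] for r in neg) / len(neg)
        if abs(diff) > threshold:
            chosen.add(name)
    return chosen


def make_toy_data(n=200, seed=1):
    """Simulated data: 'signal' depends on the outcome, 'noise' does not."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append({"y": y, "signal": rng.gauss(2.0 * y, 1.0), "noise": rng.gauss(0.0, 1.0)})
    return rows


data = make_toy_data()
freqs = inclusion_frequencies([data], toy_select, n_boot=100)
```

In this sketch a predictor with an inclusion frequency near 1 is selected in almost every bootstrap sample, while a low frequency suggests its presence in a single fitted model may be unstable. A real analysis would replace `toy_select` with the study's own selection procedure.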

Results

Model composition varied depending on how missing data were handled, and bootstrapping provided additional information on the stability of the selected prognostic model.

Conclusion

In prognostic modeling, missing data should be handled by MI, and bootstrap model selection is advised in order to provide information on model stability.
Metadata
Title
The search for stable prognostic models in multiple imputed data sets
Authors
David Vergouw
Martijn W Heymans
George M Peat
Ton Kuijpers
Peter R Croft
Henrica CW de Vet
Henriëtte E van der Horst
Daniëlle AWM van der Windt
Publication date
01-12-2010
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2010
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-10-81
