Published in: BMC Medical Research Methodology 1/2015

Open Access 01-12-2015 | Research Article

Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data

Authors: Rachel C. Jinks, Patrick Royston, Mahesh KB Parmar

Abstract

Background

Prognostic studies of time-to-event data, in which researchers aim to develop or validate multivariable prognostic models to predict survival, are common in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Events-per-variable rules are sometimes cited, but these are based on the bias and confidence interval coverage of model terms, which are not of primary interest when developing a model to predict outcome. In this paper we aim to develop sample size recommendations for multivariable models of time-to-event data, based on their prognostic ability.

Methods

We derive formulae for determining the sample size required for multivariable prognostic models in time-to-event data, based on a measure of discrimination, D, developed by Royston and Sauerbrei. These formulae fall into two categories: either based on the significance of the value of D in a new study compared to a previous estimate, or based on the precision of the estimate of D in a new study in terms of confidence interval width. Using simulation we show that they give the desired power and type I error and are not affected by random censoring. Additionally, we conduct a literature review to collate published values of D in different disease areas.
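
The precision-based category can be illustrated with a generic normal-approximation calculation. This is a minimal sketch and not the formulae derived in the paper, which the abstract does not state. It assumes that the variance of the estimate of D is inversely proportional to the number of events, with a constant `lam` calibrated from the standard error and event count of a previous study. The function name, its parameters and the numbers in the usage line are all hypothetical.

```python
import math
from statistics import NormalDist


def events_for_ci_halfwidth(prev_se, prev_events, half_width, alpha=0.05):
    """Events needed so a (1 - alpha) CI for D has the given half-width.

    ASSUMPTION (not taken from the paper): var(D_hat) ~= lam / events,
    where lam is calibrated from a previous study as
    lam = prev_events * prev_se ** 2.
    """
    if prev_se <= 0 or prev_events <= 0 or half_width <= 0:
        raise ValueError("prev_se, prev_events and half_width must be positive")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Variance constant implied by the previous study (assumed transferable).
    lam = prev_events * prev_se ** 2

    # Two-sided standard normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(1 - alpha / 2)

    # Solve z * sqrt(lam / events) = half_width for events, rounding up.
    return math.ceil(lam * (z / half_width) ** 2)


# Hypothetical inputs: a previous study with 100 events and SE(D) = 0.2,
# and a target 95% CI of D +/- 0.2 in the new study.
required_events = events_for_ci_halfwidth(prev_se=0.2, prev_events=100, half_width=0.2)
```

Under this assumption, halving the target half-width roughly quadruples the required number of events. The calculation returns a number of events rather than a number of patients, so converting it to a patient count would also need the expected proportion of patients who experience the event.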

Results

We illustrate our methods using parameters from a published prognostic study in liver cancer. The resulting sample sizes can be large, and we suggest controlling study size by expressing the desired accuracy in the new study as a relative value as well as an absolute value. To improve usability we use the values of D obtained from the literature review to develop an equation to approximately convert the commonly reported Harrell’s c-index to D. A flow chart is provided to aid decision making when using these methods.

Conclusion

We have developed a suite of sample size calculations based on the prognostic ability of a survival model, rather than the magnitude or significance of model coefficients. We have taken care to develop the practical utility of the calculations and give recommendations for their use in contemporary clinical research.
Metadata
Title
Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data
Authors
Rachel C. Jinks
Patrick Royston
Mahesh KB Parmar
Publication date
01-12-2015
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2015
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-015-0078-y
