Open Access 01-12-2015 | Research Article

Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data

Authors: Rachel C. Jinks, Patrick Royston, Mahesh KB Parmar

Published in: BMC Medical Research Methodology | Issue 1/2015

Abstract

Background

Prognostic studies of time-to-event data, in which researchers aim to develop or validate multivariable prognostic models to predict survival, are common in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Events-per-variable rules are sometimes cited, but these are based on bias and coverage of confidence intervals for model terms, which are not of primary interest when developing a model to predict outcome. In this paper we aim to develop sample size recommendations for multivariable models of time-to-event data, based on their prognostic ability.

Methods

We derive formulae for determining the sample size required for multivariable prognostic models in time-to-event data, based on a measure of discrimination, D, developed by Royston and Sauerbrei. These formulae fall into two categories: either based on the significance of the value of D in a new study compared to a previous estimate, or based on the precision of the estimate of D in a new study in terms of confidence interval width. Using simulation we show that they give the desired power and type I error and are not affected by random censoring. Additionally, we conduct a literature review to collate published values of D in different disease areas.
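The two categories of calculation described above can be illustrated with a minimal Python sketch. It is not the authors' published formulae: it assumes that the variance of D is inversely proportional to the number of events, so that the product of events and squared standard error from a previous study can be carried over to a new one, and it treats the previous estimate of D as a fixed reference value. The function names and the example inputs (a previous study with 400 events and a standard error of 0.1 for D) are hypothetical.

```python
import math
from statistics import NormalDist


def events_for_ci_width(se_prev, events_prev, half_width, alpha=0.05):
    """Events needed so a (1 - alpha) CI for D has the given half-width.

    Assumption (illustrative, not taken from the paper): var(D) is
    inversely proportional to the number of events, so the constant
    lam = events_prev * se_prev**2 transfers to the new study.
    """
    lam = events_prev * se_prev ** 2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(lam * (z / half_width) ** 2)


def events_for_difference(se_prev, events_prev, delta, alpha=0.05, power=0.9):
    """Events needed for a one-sided test to detect a difference delta
    between D in the new study and a fixed reference value.

    Assumption (illustrative): the reference value is treated as known,
    and var(D) scales as 1 / events as in events_for_ci_width.
    """
    lam = events_prev * se_prev ** 2
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(lam * ((z_alpha + z_beta) / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical previous study: 400 events, SE(D) = 0.1.
    print(events_for_ci_width(0.1, 400, half_width=0.2))
    print(events_for_difference(0.1, 400, delta=0.2))
```

Under this assumption the required number of events grows with the inverse square of the target half-width or difference, which is one way to see why the resulting sample sizes can be large when high accuracy is requested.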

Results

We illustrate our methods using parameters from a published prognostic study in liver cancer. The resulting sample sizes can be large, and we suggest controlling study size by expressing the desired accuracy in the new study as a relative value as well as an absolute value. To improve usability we use the values of D obtained from the literature review to develop an equation to approximately convert the commonly reported Harrell’s c-index to D. A flow chart is provided to aid decision making when using these methods.

Conclusion

We have developed a suite of sample size calculations based on the prognostic ability of a survival model, rather than the magnitude or significance of model coefficients. We have taken care to develop the practical utility of the calculations and give recommendations for their use in contemporary clinical research.

References

1. Moons KG, Royston P, Vergouwe Y, Grobbee DE, Altman DG. Prognosis and prognostic research: what, why, and how? BMJ. 2009; 338:1317–20.

2. Mallett S, Royston P, Dutton S, Waters R, Altman DG. Reporting methods in studies developing prognostic models in cancer: a review. BMC Med. 2010; 8:20+.

3. Altman DG. Prognostic models: a methodological framework and review of models for breast cancer. Cancer Invest. 2009; 27(3):235–43.

4. Altman DG, Lyman GH. Methodological challenges in the evaluation of prognostic factors in breast cancer. Breast Cancer Res Treat. 1998; 52(1-3):289–303.

5. McGuire WL. Breast cancer prognostic factors: evaluation guidelines. J Natl Cancer Inst. 1991; 83(3):154–5.

6. Royston P, Moons KG, Altman DG, Vergouwe Y. Prognosis and prognostic research: developing a prognostic model. BMJ. 2009; 338:1373–7.

7. Riley RD, Hayden JA, Steyerberg EW, Moons KGM, Abrams K, Kyzas PA, et al. For the PROGRESS group: Prognosis research strategy (PROGRESS) 2: Prognostic factor research. PLoS Med. 2013; 10(2):e1001380+.

8. Schoenfeld DA. Sample-size formula for the proportional-hazards regression model. Biometrics. 1983; 39(2):499–503.

9. Schmoor C, Sauerbrei W, Schumacher M. Sample size considerations for the evaluation of prognostic factors in survival analysis. Stat Med. 2000; 19(4):441–452.

10. Bernardo MVP, Lipsitz SR, Harrington DP, Catalano PJ. Sample size calculations for failure time random variables in non-randomized studies. J R Stat Soc (Series D): The Statistician. 2000; 49:31–40.

11. Hsieh F, Lavori PW. Sample-size calculations for the Cox proportional hazards regression model with nonbinary covariates. Control Clin Trials. 2000; 21(6):552–60.

12. Concato J, Peduzzi P, Holford TR, Feinstein AR. Importance of events per independent variable in proportional hazards analysis. I. Background, goals, and general strategy. J Clin Epidemiol. 1995; 48(12):1495–1501.

13. Peduzzi P, Concato J, Feinstein AR, Holford TR. Importance of events per independent variable in proportional hazards regression analysis. II. Accuracy and precision of regression estimates. J Clin Epidemiol. 1995; 48(12):1503–10.

14. Vittinghoff E, McCulloch CE. Relaxing the rule of ten events per variable in logistic and Cox regression. Am J Epidemiol. 2007; 165(6):710–8.

15. Copas JB. Regression, prediction and shrinkage. J R Stat Soc Ser B Methodol. 1983; 45(3):311–54.

16. Smith LR, Harrell FE, Muhlbaier LH. Problems and potentials in modeling survival. In: Grady ML, Schwartz HA, editors. Medical Effectiveness Research Data Methods (Summary Report) AHCPR publication, no. 92-0056. US Dept of Health and Human Services, Agency for Health Care Policy and Research: 1992. p. 151–159.

17. Ambler G, Seaman S, Omar RZ. An evaluation of penalised survival methods for developing prognostic models with rare events. Stat Med. 2012; 31:1150–61.

18. Vergouwe Y, Steyerberg EW, Eijkemans MJ, Habbema JDF. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol. 2005; 58:475–83.

19. Royston P, Sauerbrei W. A new measure of prognostic separation in survival data. Stat Med. 2004; 23(5):723–48.

20. Choodari-Oskooei B, Royston P, Parmar MK. A simulation study of predictive ability measures in a survival model. Stat Med. 2012; 31(23):2627–43.

21. Harrell FE, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic prediction. Stat Med. 1984; 3(2):143–52.

22. Gönen M, Heller G. Concordance probability and discriminatory power in proportional hazards regression. Biometrika. 2005; 92(4):965–70.

23. Jinks RC. Sample size for multivariable prognostic models: PhD thesis, University College London; 2012.

24. Kent JT, O’Quigley J. Measures of dependence for censored survival data. Biometrika. 1988; 75(3):525–34.

25. Royston P. Explained variation for survival models. Stata J. 2006; 6:1–14.

26. Armitage P, Berry G, Matthews JN. Statistical Methods in Medical Research, 4th ed. Oxford: Blackwell Science; 2001.

27. Volinsky CT, Raftery AE. Bayesian Information Criterion for Censored Survival Models. Biometrics. 2000; 56:256–62.

28. Collette S, Bonnetain F, Paoletti X, Doffoel M, Bouché O, Raoul JL, et al. Prognosis of advanced hepatocellular carcinoma: comparison of three staging systems in two French clinical trials. Ann Oncol. 2008; 19(6):1117–26.

29. Vergouwe Y, Moons KGM, Steyerberg EW. External validity of risk models: use of benchmark values to disentangle a case-mix effect from incorrect coefficients. Am J Epidemiol. 2010; 172(2):971–80.

30. Steyerberg EW. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating (Statistics for Biology and Health), 1st ed.: Springer; 2008.

31. Bender R, Augustin T, Blettner M. Generating survival times to simulate Cox proportional hazards models. Stat Med. 2005; 24(11):1713–23.

Title: Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data
Authors: Rachel C. Jinks, Patrick Royston, Mahesh KB Parmar
Publication date: 01-12-2015
Publisher: BioMed Central
Published in: BMC Medical Research Methodology / Issue 1/2015
Electronic ISSN: 1471-2288
DOI: https://doi.org/10.1186/s12874-015-0078-y
