Published in: BMC Medical Research Methodology 1/2017

Open Access 01-12-2017 | Research article

Robust estimation of the expected survival probabilities from high-dimensional Cox models with biomarker-by-treatment interactions in randomized clinical trials

Authors: Nils Ternès, Federico Rotolo, Stefan Michiels


Abstract

Background

Thanks to advances in genomics and targeted treatments, a growing number of biomarker-based prediction models are being developed to predict potential benefit from treatments in randomized clinical trials. Although the methodological framework for developing and validating prediction models in a high-dimensional setting is increasingly well established, no clear guidance yet exists on how to estimate expected survival probabilities in a penalized model with biomarker-by-treatment interactions.

Methods

Based on a parsimonious biomarker selection in a penalized high-dimensional Cox model (lasso or adaptive lasso), we propose a unified framework to: internally estimate the predictive accuracy metrics of the developed model (using double cross-validation); estimate the individual survival probabilities at a given timepoint; construct confidence intervals for these probabilities (analytical or bootstrap); and visualize them graphically (pointwise or smoothed with splines). We compared these strategies through a simulation study covering scenarios with or without biomarker effects. We applied the strategies to a large randomized phase III clinical trial that evaluated the effect of adding trastuzumab to chemotherapy in 1574 early breast cancer patients, for whom the expression of 462 genes was measured.
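
As an illustration of the second step (estimating an individual survival probability at a given timepoint), the following is a minimal sketch, not the biospear implementation. It assumes the penalized fit has already produced coefficients, a 0/1 treatment coding, and a Breslow-type estimate of the baseline cumulative hazard; all function names, coefficient values and data are hypothetical.

```python
import math


def linear_predictor(treatment, biomarkers, alpha, beta, gamma):
    """alpha*T + sum_j beta_j*x_j + sum_j gamma_j*x_j*T (assumed model form)."""
    main = sum(b * x for b, x in zip(beta, biomarkers))
    interaction = sum(g * x * treatment for g, x in zip(gamma, biomarkers))
    return alpha * treatment + main + interaction


def breslow_cumulative_hazard(times, events, lin_pred, t):
    """Breslow-type estimate of the baseline cumulative hazard H0(t).

    times    : follow-up times of the training patients
    events   : 1 if the event was observed, 0 if censored
    lin_pred : linear predictor of each training patient
    """
    risk = [math.exp(lp) for lp in lin_pred]
    h0 = 0.0
    for t_i, d_i in zip(times, events):
        if d_i == 1 and t_i <= t:
            # Sum of exp(linear predictor) over patients still at risk at t_i
            at_risk = sum(r for t_j, r in zip(times, risk) if t_j >= t_i)
            h0 += 1.0 / at_risk
    return h0


def survival_probability(h0_t, lp):
    """Cox model survival: S(t | x) = exp(-H0(t) * exp(lp))."""
    return math.exp(-h0_t * math.exp(lp))


# Hypothetical training data: one biomarker, four patients
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
treatments = [0, 1, 0, 1]
biomarkers = [[0.2], [1.5], [-0.7], [0.9]]

# Hypothetical coefficients (treatment, main effect, interaction)
alpha, beta, gamma = -0.4, [0.3], [-0.5]

lps = [linear_predictor(T, x, alpha, beta, gamma)
       for T, x in zip(treatments, biomarkers)]
h0 = breslow_cumulative_hazard(times, events, lps, t=3.0)

# Expected survival at t = 3 for a new patient under each treatment arm
for arm in (0, 1):
    lp_new = linear_predictor(arm, [1.0], alpha, beta, gamma)
    print(arm, survival_probability(h0, lp_new))
```

Comparing the two printed probabilities for the same biomarker profile gives the expected treatment benefit for that patient at the chosen timepoint, which is the quantity the interaction terms are meant to capture.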

Results

In our simulations, penalized regression models using the adaptive lasso estimated the survival probability of new patients with low bias and low standard error; bootstrapped confidence intervals had empirical coverage probabilities close to the nominal level across very different scenarios. The double cross-validation performed on the training data set closely mimicked the predictive accuracy of the selected models in external validation data. We also propose a useful visual representation of the expected survival probabilities using splines. In the breast cancer trial, the adaptive lasso penalty selected a prediction model with 4 clinical covariates, the main effects of 98 biomarkers and 24 biomarker-by-treatment interactions, but the expected survival probabilities were highly variable, with very wide confidence intervals.

Conclusion

Based on our simulations, we propose a unified framework for: developing a prediction model with biomarker-by-treatment interactions in a high-dimensional setting and validating it in the absence of external data; accurately estimating the expected survival probability of future patients with associated confidence intervals; and graphically visualizing the developed prediction model. All the methods are implemented in the R package biospear, publicly available on CRAN.
Metadata
Title
Robust estimation of the expected survival probabilities from high-dimensional Cox models with biomarker-by-treatment interactions in randomized clinical trials
Authors
Nils Ternès
Federico Rotolo
Stefan Michiels
Publication date
01-12-2017
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2017
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-017-0354-0
