Published in: BMC Medical Research Methodology 1/2019

Open Access 01-12-2019 | Technical advance

A novel approach to determine two optimal cut-points of a continuous predictor with a U-shaped relationship to hazard ratio in survival data: simulation and application

Authors: Yimin Chen, Jialing Huang, Xianying He, Yongxiang Gao, Gehendra Mahara, Zhuochen Lin, Jinxin Zhang


Abstract

Background

In clinical and epidemiological research, continuous predictors are often discretized into categorical variables to classify patients. When the relationship between a continuous predictor and the log relative hazard is U-shaped in survival data, no satisfactory method exists for finding optimal cut-points to discretize the predictor. In this study, we propose a novel approach, the optimal equal-HR method, to discretize a continuous variable that has a U-shaped relationship with the log relative hazard in survival data.

Methods

The main idea of the optimal equal-HR method is to find two optimal cut-points that have equal log relative hazard values and yield the Cox model with the minimum AIC value. An R package, ‘CutpointsOEHR’, has been developed for easy implementation of the method. A Monte Carlo simulation study was carried out to investigate its performance. The simulations varied the censoring proportion, the baseline hazard function, and the asymmetry level of the U-shaped relationship. To compare the optimal equal-HR method with other common approaches, the predictive performance of Cox models with variables discretized at different cut-points was assessed.
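
The idea of pairing cut-points by equal log relative hazard and then choosing the pair by AIC can be illustrated with a minimal Python sketch. This is not the ‘CutpointsOEHR’ implementation. It assumes the U-shaped curve is already known (the hypothetical `log_rel_hazard`, whereas in practice the curve would have to be estimated from the data), that survival times follow an exponential baseline hazard, that there are no tied event times, and that the two outer groups are merged into one high-risk category. All function names, parameter values, and the seed are illustrative assumptions.

```python
import math
import random


def log_rel_hazard(x):
    """Hypothetical asymmetric U-shaped log relative hazard, nadir at x = 0."""
    return 0.2 * x * x if x < 0 else 0.8 * x * x


def solve_branch(f, h, lo, hi, iters=60):
    """Bisection for f(x) = h on [lo, hi]; f must be monotone there."""
    lo_above = f(lo) > h
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > h) == lo_above:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cox_binary_aic(times, events, z):
    """AIC of a Cox model with one binary covariate (assumes no tied times)."""
    n1 = sum(z)              # subjects at risk with z = 1
    n0 = len(z) - n1         # subjects at risk with z = 0
    terms = []               # (z_i, n0, n1) recorded at each event time
    for i in sorted(range(len(times)), key=lambda j: times[j]):
        if events[i]:
            terms.append((z[i], n0, n1))
        if z[i]:
            n1 -= 1
        else:
            n0 -= 1
    beta = 0.0
    for _ in range(50):      # Newton-Raphson on the log partial likelihood
        grad = hess = 0.0
        for zi, a0, a1 in terms:
            p = a1 * math.exp(beta) / (a0 + a1 * math.exp(beta))
            grad += zi - p
            hess -= p * (1.0 - p)
        if hess == 0.0:
            break
        step = max(-1.0, min(1.0, grad / hess))  # damped step for stability
        beta -= step
        if abs(step) < 1e-10:
            break
    loglik = sum(zi * beta - math.log(a0 + a1 * math.exp(beta))
                 for zi, a0, a1 in terms)
    return -2.0 * loglik + 2.0   # one estimated coefficient


def optimal_equal_hr(x, times, events, f, x_min, nadir, x_max, n_levels=40):
    """Scan hazard levels; each level gives one equal-HR pair (c1, c2)."""
    h_max = min(f(x_min), f(x_max))
    best = None
    for k in range(1, n_levels):
        h = h_max * k / n_levels
        c1 = solve_branch(f, h, x_min, nadir)   # left root, f(c1) = h
        c2 = solve_branch(f, h, nadir, x_max)   # right root, f(c2) = h
        z = [0 if c1 <= xi <= c2 else 1 for xi in x]  # 1 = outside [c1, c2]
        if min(sum(z), len(z) - sum(z)) < 10:
            continue                             # skip very unbalanced splits
        aic = cox_binary_aic(times, events, z)
        if best is None or aic < best[0]:
            best = (aic, c1, c2)
    return best


# Simulated right-censored data (illustrative settings only).
random.seed(2019)
X_MIN, NADIR, X_MAX = -3.0, 0.0, 3.0
x = [random.uniform(X_MIN, X_MAX) for _ in range(500)]
t_event = [random.expovariate(0.1 * math.exp(log_rel_hazard(v))) for v in x]
t_cens = [random.expovariate(0.05) for _ in x]
times = [min(a, b) for a, b in zip(t_event, t_cens)]
events = [a <= b for a, b in zip(t_event, t_cens)]

aic, c1, c2 = optimal_equal_hr(x, times, events, log_rel_hazard,
                               X_MIN, NADIR, X_MAX)
print(f"c1 = {c1:.3f}, c2 = {c2:.3f}, AIC = {aic:.1f}")
```

Because the hypothetical curve is four times steeper on the right than on the left, every equal-HR pair satisfies c1 = -2 * c2, so the selected cut-points are not symmetric about the nadir. This is the situation in which quantile-based splits would be expected to fit less well.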

Results

Simulation results showed that, in asymmetric U-shaped scenarios, the optimal equal-HR method performed better than the median split method, the upper and lower quantiles method, and the minimum p-value method in terms of the discrimination ability and overall performance of Cox models. The optimal equal-HR method was applied to a real dataset of small cell lung cancer. The real data example demonstrated that the optimal equal-HR method could provide clinically meaningful cut-points and had good predictive performance in Cox models.

Conclusions

In general, the optimal equal-HR method is recommended for discretizing a continuous predictor with right-censored outcomes when the predictor has an asymmetric U-shaped relationship with the log relative hazard in Cox regression models.
Metadata
Title
A novel approach to determine two optimal cut-points of a continuous predictor with a U-shaped relationship to hazard ratio in survival data: simulation and application
Authors
Yimin Chen
Jialing Huang
Xianying He
Yongxiang Gao
Gehendra Mahara
Zhuochen Lin
Jinxin Zhang
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2019
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-019-0738-4
