Published in: BMC Medical Research Methodology 1/2013

Open Access 01-12-2013 | Research article

Use of generalised additive models to categorise continuous variables in clinical prediction

Authors: Irantzu Barrio, Inmaculada Arostegui, José M Quintana, IRYSS-COPD Group

Abstract

Background

In medical practice, many clinical parameters that are essentially continuous tend to be categorised by physicians for ease of decision-making. Indeed, categorisation is common practice both in medical research and in the development of clinical prediction rules, particularly where the resulting models are to be applied in daily clinical practice to support clinicians in the decision-making process. Since the number of categories into which a continuous predictor should be divided depends partly on the relationship between the predictor and the outcome, the need for more than two categories must be borne in mind.

Methods

We propose a categorisation methodology for clinical-prediction models, using Generalised Additive Models (GAMs) with P-spline smoothers to determine the relationship between the continuous predictor and the outcome. The proposed method consists of creating at least one average-risk category along with high- and low-risk categories based on the GAM smooth function. We applied this methodology to a prospective cohort of patients with exacerbated chronic obstructive pulmonary disease. The predictors selected were respiratory rate and partial pressure of carbon dioxide in the blood (PCO2), and the response variable was poor evolution. An additive logistic regression model was used to show the relationship between the covariates and the dichotomous response variable. The proposed categorisation was compared with the continuous predictor, taken as the best option, using the Akaike information criterion (AIC) and the area under the ROC curve (AUC). The sample was divided into a derivation sample (60%) and a validation sample (40%). The first was used to obtain the cut points, while the second was used to validate the proposed methodology.
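
The evaluation step described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the authors' code: the function names, the random seed, and the shuffling strategy are assumptions made for the example, and the GAM fitting itself is not shown. It covers a 60/40 derivation/validation split, the AUC computed as a rank statistic, and the AIC computed from a model's log-likelihood.

```python
import math
import random


def split_derivation_validation(records, derivation_fraction=0.6, seed=0):
    """Shuffle the records and split them into derivation and validation sets.

    The 60% default mirrors the split reported in the abstract; the seed and
    the use of a simple random shuffle are assumptions for this sketch.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_derivation = round(len(shuffled) * derivation_fraction)
    return shuffled[:n_derivation], shuffled[n_derivation:]


def auc(scores, outcomes):
    """AUC as the probability that a random event outranks a random non-event.

    A tie between an event and a non-event counts as one half.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def bernoulli_log_likelihood(probabilities, outcomes):
    """Log-likelihood of binary outcomes under predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for p, y in zip(probabilities, outcomes)
    )


def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_parameters - 2 * log_likelihood
```

With predicted probabilities from a categorised model and from a continuous model on the same validation sample, `auc` and `aic` give the two quantities compared in the abstract. A formal test of the difference between two AUCs needs a dedicated procedure and is not covered by this sketch.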

Results

The three-category proposal for the respiratory rate was ≤ 20; (20, 24]; > 24, for which the following values were obtained: AIC = 314.5 and AUC = 0.638. The respective values for the continuous predictor were AIC = 317.1 and AUC = 0.634, with no statistically significant differences being found between the two AUCs (p = 0.079). The four-category proposal for PCO2 was ≤ 43; (43, 52]; (52, 65]; > 65, for which the following values were obtained: AIC = 258.1 and AUC = 0.81. No statistically significant differences were found between the AUC of the four-category option and that of the continuous predictor, which yielded an AIC of 250.3 and an AUC of 0.825 (p = 0.115).
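
As an illustration of how the reported cut points could be applied to new measurements, the sketch below assigns a value to its right-closed interval. It is written in Python with the standard library only; the constant and function names are hypothetical and do not come from the article. The cut points are those given in the Results above, with units as in the source.

```python
from bisect import bisect_left

# Cut points reported in the abstract. Intervals are right-closed,
# e.g. (20, 24] contains 24 but not 20.
RESPIRATORY_RATE_CUTS = [20, 24]
PCO2_CUTS = [43, 52, 65]


def interval_labels(cut_points):
    """Build readable labels for the categories defined by sorted cut points."""
    labels = [f"<= {cut_points[0]}"]
    labels += [f"({lo}, {hi}]" for lo, hi in zip(cut_points, cut_points[1:])]
    labels.append(f"> {cut_points[-1]}")
    return labels


def categorise(value, cut_points):
    """Return the index of the right-closed interval that contains value.

    bisect_left places a value equal to a cut point in the lower category,
    which matches the right-closed intervals used in the abstract.
    """
    return bisect_left(cut_points, value)
```

For example, `interval_labels(PCO2_CUTS)[categorise(52, PCO2_CUTS)]` yields the label `(43, 52]`, because a value equal to a cut point falls in the lower category.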

Conclusions

Our proposed method provides clinicians with the number and location of cut points for categorising variables, and performs as successfully as the original continuous predictor when it comes to developing clinical prediction rules.
Metadata
Title
Use of generalised additive models to categorise continuous variables in clinical prediction
Authors
Irantzu Barrio
Inmaculada Arostegui
José M Quintana
IRYSS-COPD Group
Publication date
01-12-2013
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2013
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-13-83
