Published in: BMC Medical Research Methodology 1/2019

Open Access 01-12-2019 | Software

CPMCGLM: an R package for p-value adjustment when looking for an optimal transformation of a single explanatory variable in generalized linear models

Authors: Benoit Liquet, Jérémie Riou

Abstract

Background

In medical research, continuous explanatory variables are frequently transformed or converted into categorical variables. When the appropriate coding is not known in advance, several transformations are often tested in order to identify the “optimal” one. This common practice raises a multiple testing problem and requires a correction of the significance level.
Liquet and Commenges proposed an asymptotic correction of the significance level in the context of generalized linear models (GLM) (Liquet and Commenges, Stat Probab Lett 71:33–38, 2005). This procedure was developed for dichotomous and Box-Cox transformations. Furthermore, Liquet and Riou suggested the use of resampling methods to estimate the significance level for transformations into categorical variables with more than two levels (Liquet and Riou, BMC Med Res Methodol 13:75, 2013).
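
To make the multiple testing issue concrete, the following is a minimal Python sketch of a generic permutation-based adjustment for the best of several dichotomous codings. It is an illustration under stated assumptions, not the CPMCGLM implementation: the function names (`two_group_stat`, `max_stat`, `permutation_adjusted_p`), the simple standardized difference in means used in place of a GLM score test, and the simulated data are all hypothetical.

```python
import random
import statistics


def two_group_stat(y, in_upper, total_var):
    """Absolute standardized difference in the mean of y between two groups."""
    upper = [yi for yi, flag in zip(y, in_upper) if flag]
    lower = [yi for yi, flag in zip(y, in_upper) if not flag]
    if not upper or not lower or total_var == 0:
        return 0.0
    se = (total_var * (1 / len(upper) + 1 / len(lower))) ** 0.5
    return abs(statistics.fmean(upper) - statistics.fmean(lower)) / se


def max_stat(y, codings, total_var):
    """Largest statistic over all candidate dichotomous codings."""
    return max(two_group_stat(y, coding, total_var) for coding in codings)


def permutation_adjusted_p(x, y, cutpoints, n_perm=999, seed=0):
    """Permutation p-value for the best of several dichotomizations of x."""
    rng = random.Random(seed)
    # One boolean coding of x per candidate cutpoint (assumed, not from the paper).
    codings = [[xi > c for xi in x] for c in cutpoints]
    total_var = statistics.pvariance(y)  # unchanged when y is permuted
    observed = max_stat(y, codings, total_var)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks any link between x and y
        if max_stat(y_perm, codings, total_var) >= observed:
            exceed += 1
    # Count the observed data as one of the permutations.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical data: a binary outcome simulated independently of x.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [1 if data_rng.random() < 0.3 else 0 for _ in x]
cutpoints = statistics.quantiles(x, n=10)  # the nine deciles of x
p_adj = permutation_adjusted_p(x, y, cutpoints, n_perm=499)
print(f"adjusted p-value: {p_adj:.3f}")
```

Because each permuted dataset is compared against the maximum statistic over all cutpoints, the resulting p-value accounts for having searched over the codings, which a naive p-value for the single best cutpoint does not.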

Results

CPMCGLM provides users with both methods of p-value adjustment. Furthermore, they are available for a large set of transformations.
This paper gives the user an overview of the methodological context and explains in detail how to use the CPMCGLM R package through its application to a real epidemiological dataset.

Conclusion

We present here the CPMCGLM R package, which provides efficient methods for correcting the type-I error rate in the context of generalized linear models. This is the first and only available R package providing such methods in this context.
This package is designed to help researchers, principally those working in biostatistics and epidemiology, to analyze their data in the context of optimal cutoff point determination.
Literature
1. Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006; 25(1):127–41.
3. McCullagh P, Nelder JA. Generalized Linear Models, Second Edition. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. London: Taylor & Francis; 1989.
4. Rao CR. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. In: Mathematical Proceedings of the Cambridge Philosophical Society, vol. 44. Cambridge University Press; 1948. p. 50–57.
5. Berger RL. Multiparameter hypothesis testing and acceptance sampling. Technometrics. 1982; 24(4):295–300.
6. Liquet B, Riou J. Correction of the significance level when attempting multiple transformations of an explanatory variable in generalized linear models. BMC Med Res Methodol. 2013; 13(1):75.
7. Delorme P, Micheaux PL, Liquet B, Riou J. Type-II generalized family-wise error rate formulas with application to sample size determination. Stat Med. 2016; 35(16):2687–714.
8. Simes R. An improved Bonferroni procedure for multiple tests of significance. Biometrika. 1986; 73(3):751–4.
9. Worsley KJ. An improved Bonferroni inequality and applications. Biometrika. 1982; 69:297–302.
10. Hochberg Y. A sharper Bonferroni procedure for multiple tests of significance. Biometrika. 1988; 75:800–2.
11. Liquet B, Commenges D. Correction of the p-value after multiple coding of an explanatory variable in logistic regression. Stat Med. 2001; 20:2815–26.
12. Liquet B, Commenges D. Computation of the p-value of the minimum of score tests in the generalized linear model, application to multiple coding. Stat Probab Lett. 2005; 71:33–38.
13. Genz A, Bretz F. Computation of Multivariate Normal and T Probabilities. Lecture Notes in Statistics. Heidelberg: Springer; 2009.
15. Romano JP. On the behavior of randomization tests without a group invariance assumption. J Am Stat Assoc. 1990; 85:686.
16. Xu H, Hsu JC. Applying the generalized partitioning principle to control the generalized familywise error rate. Biom J. 2007; 49(1):52–67.
17. Kaizar EE, Li Y, Hsu JC. Permutation multiple tests of binary features do not uniformly control error rates. J Am Stat Assoc. 2011; 106(495):1067–74.
18. Commenges D, Liquet B. Asymptotic distribution of score statistics for spatial cluster detection with censored data. Biometrics. 2008; 64(4):1287–9.
19. Commenges D. Transformations which preserve exchangeability and application to permutation tests. J Nonparametric Stat. 2003; 15(2):171–85.
20. Westfall PH, Troendle JF. Multiple testing with minimal assumptions. Biom J. 2008; 50(5):745–55.
22. Box GE, Cox DR. An analysis of transformations. J R Stat Soc Ser B Methodol. 1964:211–52.
23. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. Appl Stat. 1994:429–67.
24. Royston P, Ambler G, Sauerbrei W. The use of fractional polynomials to model continuous risk variables in epidemiology. Int J Epidemiol. 1999; 28(5):964–74.
25. Royston P, Altman DG. Approximating statistical functions by using fractional polynomial regression. J R Stat Soc Ser D (The Stat). 1997; 46(3):411–22.
26. Bonarek M, Barberger-Gateau P, Letenneur L, Deschamps V, Iron A, Dubroca B, Dartigues J. Relationships between cholesterol, apolipoprotein E polymorphism and dementia: a cross-sectional analysis from the PAQUID study. Neuroepidemiology. 2000; 19:141–48.
Metadata
Title
CPMCGLM: an R package for p-value adjustment when looking for an optimal transformation of a single explanatory variable in generalized linear models
Authors
Benoit Liquet
Jérémie Riou
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2019
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-019-0711-2
