Published in: BMC Medical Research Methodology 1/2016

Open Access 01-12-2016 | Research article

The case-crossover design via penalized regression

Authors: Sam Doerken, Maja Mockenhaupt, Luigi Naldi, Martin Schumacher, Peggy Sekula

Abstract

Background

The case-crossover design is an attractive alternative to the classical case–control design for studying the onset of acute events when the risk factors of interest vary over time. Because it compares exposures within cases across different time periods, the case-crossover design does not rely on control subjects, who can be difficult to recruit. However, with the standard method of maximum likelihood, the resulting risk estimates can be heavily biased when the prevalence of risk factors is very low (or very high).
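
As a sketch of why no control subjects are needed, the conditional likelihood commonly used for case-crossover data can be written as below. The notation is an assumption made here for illustration and is not taken from the article: x_{i0} is the exposure vector of case i in the case period, x_{im} (m = 1, ..., M_i) are the exposure vectors in that case's control periods, and β holds the log odds ratios.

```latex
L(\beta) \;=\; \prod_{i=1}^{n}
  \frac{\exp\!\left(x_{i0}^{\top}\beta\right)}
       {\sum_{m=0}^{M_i} \exp\!\left(x_{im}^{\top}\beta\right)}
```

A case whose exposure is identical in all periods contributes the constant factor 1/(M_i + 1), so only cases with discordant exposure carry information about β. When a risk factor is very rare (or nearly universal), few cases are discordant, which is the setting in which maximum likelihood estimates become unstable.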

Methods

To overcome the problem of low risk factor prevalences, penalized conditional logistic regression via the lasso (least absolute shrinkage and selection operator) has been proposed in the literature, as have related methods such as the Firth correction. We apply and compare several penalized regression approaches in the context of a case-crossover analysis of the European Study of Severe Cutaneous Adverse Reactions (EuroSCAR; 1997–2001).
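
To illustrate how penalization stabilizes estimates, the following minimal sketch considers a toy setting that is an assumption made here, not the authors' analysis: a single binary exposure and one control period per case. In that setting the unpenalized, lasso-penalized and Firth-corrected estimates of the log odds ratio all have closed forms. The function names and the penalty value `lam` are hypothetical, and the code does not use the R packages the abstract alludes to.

```python
import math


def mle_log_or(a, b):
    """Unpenalized conditional ML estimate of the log odds ratio.

    a: number of cases exposed only in the case period
    b: number of cases exposed only in the control period
    (cases with the same exposure in both periods carry no information)
    """
    if a == 0 and b == 0:
        return float("nan")  # no discordant cases at all
    if b == 0:
        return math.inf      # separation: estimate diverges upward
    if a == 0:
        return -math.inf     # separation: estimate diverges downward
    return math.log(a / b)


def lasso_log_or(a, b, lam):
    """Maximizer of a*beta - (a + b)*log(1 + exp(beta)) - lam*|beta|.

    lam > 0 is the lasso penalty (illustrative value chosen by the caller).
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if a - b > 2 * lam:
        return math.log((a - lam) / (b + lam))
    if b - a > 2 * lam:
        return math.log((a + lam) / (b - lam))
    return 0.0  # the penalty dominates: the exposure is dropped


def firth_log_or(a, b):
    """Firth-type (Jeffreys prior) estimate for this toy setting.

    Equivalent to adding half a discordant case to each count.
    """
    return math.log((a + 0.5) / (b + 0.5))


if __name__ == "__main__":
    a, b = 6, 0  # every discordant case was exposed in the case period
    print("ML:   ", mle_log_or(a, b))
    print("lasso:", lasso_log_or(a, b, lam=1.0))
    print("Firth:", firth_log_or(a, b))
```

With no cases exposed only in the control period, the maximum likelihood estimate is infinite, whereas both penalized estimates remain finite. The lasso additionally sets the estimate to exactly zero when the two counts differ by no more than twice the penalty, which is how it performs variable selection.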

Results

Out of 30 drugs, standard methods correctly classified only 17 (including some with highly implausible risk estimates), while penalized methods correctly classified 22.

Conclusion

Penalized methods generally yield better risk classifications and much more plausible risk estimates for the EuroSCAR study than standard methods. As these novel techniques can be easily implemented using available R packages, we encourage routine use of penalized conditional logistic regression for case-crossover data.
Metadata

Title: The case-crossover design via penalized regression
Authors: Sam Doerken, Maja Mockenhaupt, Luigi Naldi, Martin Schumacher, Peggy Sekula
Publication date: 01-12-2016
Publisher: BioMed Central
Published in: BMC Medical Research Methodology, Issue 1/2016
Electronic ISSN: 1471-2288
DOI: https://doi.org/10.1186/s12874-016-0197-0
