Published in: BMC Medical Research Methodology 1/2020

Open Access 01-12-2020 | Research article

Sample size issues in time series regressions of counts on environmental exposures

Authors: Ben G. Armstrong, Antonio Gasparrini, Aurelio Tobias, Francesco Sera

Abstract

Background

Regression analyses of time series of disease counts on environmental determinants are a prominent component of environmental epidemiology. For planning such studies, it can be useful to predict the precision of estimated coefficients and power to detect associations of given magnitude. Existing generic approaches for this have been found somewhat complex to apply and do not easily extend to multiple series studies analysed in two stages. We have sought a simpler approximate approach which can easily extend to multiple series and give insight into factors determining precision.

Methods

We derive approximate expressions for precision and hence power in single and multiple time series studies of counts from basic statistical theory, compare the precision predicted by these with that estimated by analysis in real data from 51 cities of varying size, and illustrate the use of these estimators in a realistic planning scenario.

Results

In single series studies with Poisson outcome distribution, precision and power depend only on the usable variation of exposure (i.e. that conditional on covariates) and the total number of disease events, regardless of how many days those are spread over. In multiple time series (e.g. multi-city) studies focusing on the meta-analytic mean coefficient, the usable exposure variation and the total number of events (in all series) are again the sole determinants if there is no between-series heterogeneity or within-series overdispersion. With heterogeneity, its extent and the number of series become important. For all but the crudest approximation, the estimates of standard errors were on average within ±20% of those estimated in full analysis of actual data.
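The dependence described above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the standard Fisher-information approximation for a log-linear Poisson model, in which the information for the exposure coefficient is roughly the total number of events times the usable (covariate-conditional) exposure variance, divided by an overdispersion factor. The multi-series function assumes inverse-variance random-effects weighting with a known between-series standard deviation. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def approx_se(total_events, usable_exposure_sd, dispersion=1.0):
    """Approximate SE of the log rate ratio per unit of exposure.

    Assumption: information ~ total_events * usable variance / dispersion.
    The number of days does not appear, only the total event count.
    """
    information = total_events * usable_exposure_sd ** 2 / dispersion
    return 1.0 / math.sqrt(information)


def approx_power(beta, se, alpha=0.05):
    """Power of a two-sided Wald test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(beta) / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def pooled_se(series_ses, tau=0.0):
    """SE of a random-effects meta-analytic mean coefficient.

    Assumption: inverse-variance weights 1 / (se_i^2 + tau^2), with tau
    (the between-series SD of the true coefficients) treated as known.
    """
    weights = [1.0 / (se ** 2 + tau ** 2) for se in series_ses]
    return 1.0 / math.sqrt(sum(weights))


# One series with 10,000 events versus four series of 2,500 events each,
# all with usable exposure SD of 1 and no overdispersion.
single = approx_se(10_000, 1.0)
four_cities = [approx_se(2_500, 1.0) for _ in range(4)]
print(single, pooled_se(four_cities))             # equal when tau = 0
print(pooled_se(four_cities, tau=0.02))           # larger with heterogeneity
print(approx_power(beta=0.03, se=single))         # power for a given effect
```

Under these assumptions, splitting the same total number of events across several series leaves the pooled standard error unchanged when there is no heterogeneity. Once `tau` is positive, the number of series starts to matter.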

Conclusions

The precision of coefficients from a planned time series study can be predicted simply and from limited information. The total number of disease events and usable exposure variation are the dominant factors when overdispersion and between-series heterogeneity are low.
Metadata
Title
Sample size issues in time series regressions of counts on environmental exposures
Authors
Ben G. Armstrong
Antonio Gasparrini
Aurelio Tobias
Francesco Sera
Publication date
01-12-2020
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2020
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-019-0894-6