BMC Medical Research Methodology

Open Access 01-12-2015 | Research article

Comparing denominator degrees of freedom approximations for the generalized linear mixed model in analyzing binary outcome in small sample cluster-randomized trials

Authors: Peng Li, David T Redden

Published in: BMC Medical Research Methodology | Issue 1/2015

Abstract

Background

A small number of clusters and large variation in cluster sizes are common in cluster-randomized trials (CRTs) and are often the critical factors affecting the validity and efficiency of statistical analyses. F tests are commonly used in the generalized linear mixed model (GLMM) to test intervention effects in CRTs. The most challenging issue for the approximate Wald F test is estimating the denominator degrees of freedom (DDF). Several DDF approximation methods have been proposed, but their small-sample performance in analysing binary outcomes in CRTs with few heterogeneous clusters has not been well studied.

Methods

The small-sample performance of five DDF approximations for the F test is compared under CRT frameworks using simulations. Specifically, we illustrate how the intraclass correlation (ICC), sample size, and variation in cluster sizes affect the type I error and statistical power when different DDF approximation methods in the GLMM are used to test the intervention effect in CRTs with binary outcomes. The results are also illustrated using a real CRT dataset.
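To make the simulation setting concrete, the following is a minimal sketch of how one CRT dataset with a binary outcome might be generated. It is not the authors' simulation code. It assumes a random-intercept logistic model, the latent-scale ICC convention ICC = sigma^2 / (sigma^2 + pi^2/3), and uniformly distributed cluster sizes; the function name `simulate_crt` and all of its parameters are hypothetical.

```python
import math
import random


def simulate_crt(n_clusters=10, mean_size=20, size_spread=10, icc=0.05,
                 base_prob=0.3, odds_ratio=1.0, seed=0):
    """Simulate one cluster-randomized trial with a binary outcome.

    Assumptions (not taken from the paper): random-intercept logistic
    model, latent-scale ICC, cluster sizes drawn uniformly from
    [mean_size - size_spread, mean_size + size_spread].
    """
    rng = random.Random(seed)

    # Convert the ICC to a random-intercept variance on the logit scale.
    sigma = math.sqrt(icc * (math.pi ** 2 / 3) / (1 - icc))

    beta0 = math.log(base_prob / (1 - base_prob))  # control-arm log odds
    beta1 = math.log(odds_ratio)                   # intervention effect

    # Allocate clusters to arms as evenly as possible, in random order.
    arms = [0, 1] * (n_clusters // 2) + [0] * (n_clusters % 2)
    rng.shuffle(arms)

    data = []
    for cluster, arm in enumerate(arms):
        low = max(1, mean_size - size_spread)
        size = rng.randint(low, mean_size + size_spread)
        b = rng.gauss(0.0, sigma)                  # cluster random effect
        p = 1 / (1 + math.exp(-(beta0 + beta1 * arm + b)))
        events = sum(rng.random() < p for _ in range(size))
        data.append({"cluster": cluster, "arm": arm,
                     "size": size, "events": events})
    return data


if __name__ == "__main__":
    for row in simulate_crt(n_clusters=10, odds_ratio=1.0, seed=1):
        print(row)
```

Setting `odds_ratio=1.0` generates data under the null hypothesis, so repeating the simulation and recording how often a test rejects gives an empirical type I error rate. Setting it to another value gives empirical power.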

Results

Our simulation results suggest that the Between-Within method maintains the nominal type I error rate even when the total number of clusters is as low as 10, and that it is robust to variation in cluster sizes. The Residual and Containment methods have inflated type I error rates when the number of clusters is small (<30), and the inflation becomes more severe as variation in cluster sizes increases. In contrast, the Satterthwaite and Kenward-Roger methods can yield tests with very conservative type I error rates when the total number of clusters is small (<30), and the conservativeness becomes more severe as variation in cluster sizes increases. Our simulations also suggest that the Between-Within method is statistically more powerful than the Satterthwaite or Kenward-Roger method in analysing CRTs with heterogeneous cluster sizes, especially when the number of clusters is small.
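A rough sketch of why the closed-form methods can behave so differently with few clusters is given below. The formulas are the commonly documented rules for a two-arm CRT with a cluster-level intervention and a random cluster intercept, as we understand them; they are not quoted from this abstract. The function name `ddf_table` is hypothetical. The Satterthwaite and Kenward-Roger values depend on the estimated variance components and are therefore omitted.

```python
def ddf_table(cluster_sizes, n_fixed=2, n_between_fixed=2):
    """Closed-form denominator degrees of freedom for the intervention test.

    Assumed design: fixed intercept plus one cluster-level intervention
    indicator (n_fixed = n_between_fixed = 2) and a random intercept
    per cluster.
    """
    n_obs = sum(cluster_sizes)
    n_clusters = len(cluster_sizes)
    return {
        # Residual: N - rank(X)
        "residual": n_obs - n_fixed,
        # Containment, assuming the fallback N - rank[X Z] applies because
        # no random effect syntactically contains the intervention effect;
        # rank[X Z] equals the number of clusters in this design.
        "containment_fallback": n_obs - n_clusters,
        # Between-Within: clusters minus between-cluster fixed parameters
        "between_within": n_clusters - n_between_fixed,
    }


if __name__ == "__main__":
    sizes = [5, 8, 12, 15, 20, 20, 25, 30, 30, 35]  # 10 unequal clusters
    print(ddf_table(sizes))
```

With 10 clusters and 200 participants, the Residual and Containment rules under these assumptions give DDF values near the number of participants, whereas the Between-Within rule gives a value tied to the number of clusters. A larger DDF lowers the critical value of the F test, which is consistent with the inflated type I error rates reported for the first two methods.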

Conclusion

We conclude that the Between-Within denominator degrees of freedom approximation method for F tests should be recommended when the GLMM is used in analysing CRTs with binary outcomes and few heterogeneous clusters, due to its type I error properties and relatively higher power.

Title: Comparing denominator degrees of freedom approximations for the generalized linear mixed model in analyzing binary outcome in small sample cluster-randomized trials
Authors: Peng Li, David T Redden
Publication date: 01-12-2015
Publisher: BioMed Central
Published in: BMC Medical Research Methodology / Issue 1/2015
Electronic ISSN: 1471-2288
DOI: https://doi.org/10.1186/s12874-015-0026-x
