Published in: Prevention Science 7/2015

01-10-2015

Sample Size Considerations in Prevention Research Applications of Multilevel Modeling and Structural Equation Modeling

Authors: Rick H. Hoyle, Nisha C. Gottfredson

Abstract

When the goal of prevention research is to capture in statistical models some measure of the dynamic complexity in structures and processes implicated in problem behavior and its prevention, approaches such as multilevel modeling (MLM) and structural equation modeling (SEM) are indicated. Yet the assumptions that must be satisfied if these approaches are to be used responsibly raise concerns regarding their use in prevention research involving smaller samples. In this article, we discuss in nontechnical terms the role of sample size in MLM and SEM and present findings from the latest simulation work on the performance of each approach at sample sizes typical of prevention research. For each statistical approach, we draw from extant simulation studies to establish lower bounds for sample size (e.g., MLM can be applied with as few as ten groups comprising ten members with normally distributed data, restricted maximum likelihood estimation, and a focus on fixed effects; sample sizes as small as N = 50 can produce reliable SEM results with normally distributed data and at least three reliable indicators per factor) and suggest strategies for making the best use of the modeling approach when N is near the lower bound.
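
The MLM lower bound quoted in the abstract (ten groups of ten members, normally distributed data, a focus on fixed effects) can be probed with a small Monte Carlo check. The sketch below is illustrative only and is not the simulation design used in the studies the article reviews: the parameter values (`intercept`, `tau`, `sigma`), the number of replications, and all function names are assumptions. It uses ANOVA (method-of-moments) estimators, which for a balanced random-intercept design coincide with restricted maximum likelihood whenever the between-group variance estimate is positive, and it builds the interval for the fixed intercept from the between-group mean square with a t reference distribution on 9 degrees of freedom.

```python
import random
import statistics

N_GROUPS = 10        # level-2 units, matching the lower bound in the abstract
N_PER_GROUP = 10     # level-1 units per group
T_CRIT_DF9 = 2.262   # two-sided 95% critical value of t with N_GROUPS - 1 = 9 df


def simulate_groups(intercept, tau, sigma, rng):
    """Balanced random-intercept data: y_ij = intercept + u_j + e_ij."""
    data = []
    for _ in range(N_GROUPS):
        u = rng.gauss(0.0, tau)  # group-level deviation
        data.append([intercept + u + rng.gauss(0.0, sigma)
                     for _ in range(N_PER_GROUP)])
    return data


def fit_balanced_random_intercept(data):
    """ANOVA estimates for a balanced design (equal group sizes assumed).

    Returns (fixed intercept, its standard error, between-group variance,
    within-group variance). The between-group variance is truncated at zero.
    """
    n_groups = len(data)
    n = len(data[0])
    group_means = [statistics.fmean(g) for g in data]
    grand_mean = statistics.fmean(group_means)
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    msw = sum((y - m) ** 2 for g, m in zip(data, group_means) for y in g)
    msw /= n_groups * (n - 1)
    tau2 = max(0.0, (msb - msw) / n)
    se = (msb / (n_groups * n)) ** 0.5  # Var(grand mean) = E[MSB] / (J * n)
    return grand_mean, se, tau2, msw


def coverage(n_reps=2000, intercept=0.5, tau=0.5, sigma=1.0, seed=1):
    """Share of replications whose 95% interval contains the true intercept."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        data = simulate_groups(intercept, tau, sigma, rng)
        est, se, _, _ = fit_balanced_random_intercept(data)
        if abs(est - intercept) <= T_CRIT_DF9 * se:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    print("Empirical 95% CI coverage for the fixed intercept:", coverage())
```

Under these assumptions the interval for the fixed effect should cover the true value at close to the nominal rate. The same harness could be extended to track the variance component estimates, which is where small numbers of groups tend to cause more trouble.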
Footnotes
1
Nested data may also emerge from analyses aimed at accounting for unobserved heterogeneity in outcomes such as in latent class analysis or growth mixture models with longitudinal data. Relatively little is known about sample size requirements for analyses of these types but they almost certainly require samples that are large. For that reason, those models fall outside the scope of this manuscript.
 
2
A number of software packages can handle such analyses, including (but certainly not limited to): HLM, Mplus, R, SAS, and Stata.
 
3
We omit a discussion of least squares approaches because these tend to be less efficient than maximum likelihood (Singer and Willett 2003).
 
4
We recognize that 30 groups is unrealistic for many prevention studies. This simulation study did not consider fewer than 30 groups given that the results were already somewhat problematic for that number. Findings from work considering 10 level 2 units are presented below.
 
5
However, progress is being made on this front. See Noh and Lee (2007) for an example.
 
6
Bauer and Sterba (2011) also found that increasing the number of response categories resulted in less bias and greater efficiency. This result is not surprising given the well-known consequences of dichotomization (MacCallum et al. 2002). Whenever possible, researchers should maximize the number of response categories or use a continuous response scale to maximize power.
 
7
Software packages that can estimate SEMs include, but are not limited to: EQS, LISREL, Mplus, R, and Stata.
 
Literature
Austin, P. C. (2010). Estimating multilevel logistic regression models when the number of clusters is low: A comparison of different statistical software procedures. The International Journal of Biostatistics, 6, 1–18. doi:10.2202/1557-4679.1195.
Bandalos, D. L., & Gagné, P. (2012). Simulation methods in structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 92–108). New York: Guilford Press.
Bearden, W. O., Sharma, S., & Teel, J. E. (1982). Sample size effects on chi square and other statistics used in evaluating causal models. Journal of Marketing Research, 19, 425–430. doi:10.2307/3151716.
Bollen, K. A., & Curran, P. J. (2006). Latent curve models: A structural equation approach. Hoboken: Wiley.
Curran, P. J., Lee, T., Howard, A. L., Lane, S., & MacCallum, R. (2012). Disaggregating within-person and between-person effects in multilevel and structural equation growth models. In J. R. Harring & G. R. Hancock (Eds.), Advances in longitudinal models in the social and behavioral sciences (pp. 217–253). Charlotte: Information Age Publishing.
Demidenko, E. (2004). Mixed models: Theory and applications. Hoboken: Wiley.
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on structural equation modeling fit indexes. Structural Equation Modeling, 6, 56–83. doi:10.1080/10705519909540119.
Hopkin, C. R., Hoyle, R. H., & Gottfredson, N. C. (2013). Maximizing the yield of small samples in prevention research: A review of general strategies and best practices. Manuscript submitted for publication.
Hoyle, R. H. (2011). Structural equation modeling for social and personality psychology. London: Sage Publications.
Hoyle, R. H., & Kenny, D. A. (1999). Sample size, reliability, and tests of statistical mediation. In R. H. Hoyle (Ed.), Statistical strategies for small sample research (pp. 195–222). Thousand Oaks: Sage Publications.
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55. doi:10.1080/10705519909540118.
Kaplan, D. (1995). Statistical power in structural equation modeling. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 100–117). Newbury Park: Sage Publications.
Kreft, I. G. G., & de Leeuw, J. (1988). Introducing multilevel modeling. Thousand Oaks: Sage Publications.
Lüdtke, O., Marsh, H. W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203–229. doi:10.1037/a0012869.
Marsh, H. W., Hau, K.-T., Balla, J. R., & Grayson, D. (1998). Is more ever too much? The number of indicators per factor in confirmatory factor analysis. Multivariate Behavioral Research, 33, 181–220. doi:10.1207/s15327906mbr3302_1.
McArdle, J. J., & Nesselroade, J. R. (2003). Growth curve analysis in contemporary psychological research. In J. Schinka & W. Velicer (Eds.), Comprehensive handbook of psychology: Research methods in psychology (pp. 447–480). New York: Wiley. doi:10.1002/0471264385.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks: Sage Publications.
Raudenbush, S. W., Bryk, A. S., & Congdon, R. (2004). HLM 6 for Windows [Computer software]. Lincolnwood: Scientific Software.
Raudenbush, S. W., Spybrook, J., Congdon, R., Liu, X., & Martinez, A. (2011). Optimal Design Software for multi-level and longitudinal research (version 3.01) [Software]. Available from www.wtgrantfoundation.org.
Rodríguez, G., & Goldman, N. (2001). Improved estimation procedures for multilevel models with binary response: A case study. Journal of the Royal Statistical Society: Series A (Statistics in Society), 164, 339–355. doi:10.1111/1467-985X.00206.
Shiyko, M. P., Lanza, S. T., Tan, X., Li, R., & Shiffman, S. (2012). Using the time-varying effect model (TVEM) to examine dynamic associations between negative affect and self-confidence on smoking urges: Differences between successful quitters and relapsers. Prevention Science, 13, 288–299. doi:10.1007/s11121-011-0264-z.
Simon, T. R., Ikeda, R. M., Smith, E. P., Reese, L. E., Rabiner, D. L., Miller-Johnson, S., et al. (2008). The multisite violence prevention project: Impact of a universal school-based violence prevention program on social-cognitive outcomes. Prevention Science, 9, 231–244. doi:10.1007/s11121-008-0101-1.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York: Oxford University Press.
Snijders, T. A. B., & Bosker, R. J. (2004). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed.). London: Sage Publications.
Steiger, J. H., & Lind, J. C. (1980). Statistically based tests for the number of common factors. Iowa City: Paper presented at the Meeting of the Psychometric Society.
Whiteside, S. P., & Lynam, D. R. (2001). The five factor model and impulsivity: Using a structural model of personality to understand impulsivity. Personality and Individual Differences, 30, 669–689. doi:10.1016/S0191-8869(00)00064-7.
Metadata
Title
Sample Size Considerations in Prevention Research Applications of Multilevel Modeling and Structural Equation Modeling
Authors
Rick H. Hoyle
Nisha C. Gottfredson
Publication date
01-10-2015
Publisher
Springer US
Published in
Prevention Science / Issue 7/2015
Print ISSN: 1389-4986
Electronic ISSN: 1573-6695
DOI
https://doi.org/10.1007/s11121-014-0489-8