Published in: Prevention Science 3/2007

01-09-2007

How Many Imputations are Really Needed? Some Practical Clarifications of Multiple Imputation Theory

Authors: John W. Graham, Allison E. Olchowski, Tamika D. Gilreath

Abstract

Multiple imputation (MI) and full information maximum likelihood (FIML) are the two most common approaches to missing data analysis. In theory, MI and FIML are equivalent when identical models are tested using the same variables, and when m, the number of imputations performed with MI, approaches infinity. However, it is important to know how many imputations are necessary before MI and FIML are sufficiently equivalent in ways that are important to prevention scientists. MI theory suggests that small values of m, even on the order of three to five imputations, yield excellent results. Previous guidelines for sufficient m are based on relative efficiency, which involves the fraction of missing information (γ) for the parameter being estimated, and m. In the present study, we used a Monte Carlo simulation to test MI models across several scenarios in which γ and m were varied. Standard errors and p-values for the regression coefficient of interest varied as a function of m, but not at the same rate as relative efficiency. Most importantly, statistical power for small effect sizes diminished as m became smaller, and the rate of this power falloff was much greater than predicted by changes in relative efficiency. Based on our findings, we recommend that researchers using MI should perform many more imputations than previously considered sufficient. These recommendations are based on γ, and take into consideration one’s tolerance for a preventable power falloff (compared to FIML) due to using too few imputations.
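The abstract refers to relative efficiency as a function of γ and m but does not state the formula. As a rough illustration, the sketch below assumes the variance-scale form commonly given in the multiple imputation literature, RE = 1 / (1 + γ/m). The function names and the grid of γ and m values are hypothetical choices made for this example and are not taken from the article. The sketch covers only the efficiency criterion behind the earlier guidelines; it does not reproduce the authors' Monte Carlo power results.

```python
from typing import Dict, Iterable, Tuple


def relative_efficiency(gamma: float, m: int) -> float:
    """Efficiency of an estimate based on m imputations relative to
    infinitely many, assuming the variance-scale form 1 / (1 + gamma / m).

    gamma is the fraction of missing information (between 0 and 1).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 / (1.0 + gamma / m)


def efficiency_table(
    gammas: Iterable[float], ms: Iterable[int]
) -> Dict[Tuple[float, int], float]:
    """Relative efficiency for every (gamma, m) pair in the grid."""
    ms = list(ms)  # allow any iterable to be reused across gammas
    return {(g, m): relative_efficiency(g, m) for g in gammas for m in ms}


if __name__ == "__main__":
    # Illustrative grid only; these values are assumptions, not the article's.
    gammas = [0.1, 0.3, 0.5, 0.7, 0.9]
    ms = [3, 5, 10, 20, 40, 100]
    table = efficiency_table(gammas, ms)

    print("gamma " + "".join(f"{'m=' + str(m):>8}" for m in ms))
    for g in gammas:
        print(f"{g:<6}" + "".join(f"{table[(g, m)]:8.3f}" for m in ms))
```

Under this assumed formula, relative efficiency is already above 0.9 with five imputations at γ = 0.5, which is why small m looked sufficient by that criterion. The abstract's point is that statistical power falls off faster than this quantity suggests.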
Metadata
Title: How Many Imputations are Really Needed? Some Practical Clarifications of Multiple Imputation Theory
Authors: John W. Graham, Allison E. Olchowski, Tamika D. Gilreath
Publication date: 01-09-2007
Publisher: Springer US
Published in: Prevention Science, Issue 3/2007
Print ISSN: 1389-4986
Electronic ISSN: 1573-6695
DOI: https://doi.org/10.1007/s11121-007-0070-9
