Published in: Health Services and Outcomes Research Methodology 2-3/2012

01-06-2012

Using AIC in multiple linear regression framework with multiply imputed data

Authors: Ashok Chaurasia, Ofer Harel

Abstract

Many model selection criteria proposed over the years have become common procedures in applied research. However, these procedures were designed for complete data, which is rare in applied statistics, particularly in medical, public health, and health policy settings. Incomplete data introduces its own complications, which can make model selection considerably harder. Recently, a few authors have proposed model selection procedures for incomplete data, with varying degrees of success. In this paper we explore model selection by the Akaike Information Criterion (AIC) in the multivariate regression setting with ignorable missing data accounted for via multiple imputation.
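
For orientation, the standard complete-data AIC for a Gaussian linear regression can be written as below. The symbols are introduced here for illustration and do not come from the abstract: n is the sample size, p the number of regression coefficients (intercept included), RSS the residual sum of squares, and M the number of imputed data sets. The last line shows one naive way to use AIC with multiply imputed data, namely averaging the per-imputation values. This is an assumption made purely for illustration; the abstract does not state which combining rule the paper studies or recommends.

```latex
% General definition (Akaike, 1974): k estimated parameters, maximized likelihood \hat{L}
\mathrm{AIC} = 2k - 2\ln\hat{L}

% Gaussian linear regression with p coefficients plus the error variance (k = p + 1),
% using the maximum likelihood estimate \hat{\sigma}^2 = \mathrm{RSS}/n
\mathrm{AIC} = n\ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2(p + 1) + n\left(1 + \ln 2\pi\right)

% Illustrative assumption only: average the AIC computed on each of the
% M imputed data sets, where \mathrm{AIC}^{(m)} is the value from imputation m
\overline{\mathrm{AIC}} = \frac{1}{M}\sum_{m=1}^{M}\mathrm{AIC}^{(m)}
```

The additive term n(1 + ln 2π) is the same for all candidate models fitted to the same n observations, so it does not affect which model attains the smallest AIC.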
Metadata
Title: Using AIC in multiple linear regression framework with multiply imputed data
Authors: Ashok Chaurasia, Ofer Harel
Publication date: 01-06-2012
Publisher: Springer US
Published in: Health Services and Outcomes Research Methodology, Issue 2-3/2012
Print ISSN: 1387-3741
Electronic ISSN: 1572-9400
DOI: https://doi.org/10.1007/s10742-012-0088-8
