Published in: BMC Medical Research Methodology 1/2023

Open Access 01-12-2023 | Research

Multiple imputation methods for missing multilevel ordinal outcomes

Authors: Mei Dong, Aya Mitani

Abstract

Background

Multiple imputation (MI) is an established technique for handling missing data in observational studies. Joint modelling (JM) and fully conditional specification (FCS) are commonly used methods for imputing multilevel data. However, MI methods for multilevel ordinal outcome variables have not been well studied, especially when cluster size is informative on the outcome. The purpose of this study is to describe and compare different MI strategies for dealing with multilevel ordinal outcomes when informative cluster size (ICS) exists.
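
As a minimal illustration of how MI results are usually combined, the sketch below pools estimates from several imputed data sets with Rubin's rules. The function name `pool_rubin` and the numbers are hypothetical and are not taken from the study. The pooled estimate is the mean of the per-imputation estimates, and the total variance adds the within-imputation variance to the between-imputation variance inflated by (1 + 1/m).

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance
    return q_bar, t


# Hypothetical log odds ratios and squared standard errors from m = 3 imputations
est, total_var = pool_rubin([0.50, 0.60, 0.40], [0.04, 0.05, 0.03])
print(round(est, 4), round(total_var, 4))
```

The same pooling applies whichever strategy (FCS or JM) produced the imputed data sets; only the imputation step differs.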

Methods

We conducted comprehensive Monte Carlo simulation studies to compare the performance of five strategies: complete case analysis (CCA), FCS, FCS+CS (including cluster size (CS) in the imputation model), JM, and JM+CS under various scenarios. We evaluated their performance using a proportional odds logistic regression model estimated with cluster weighted generalized estimating equations (CWGEE).
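
To show why cluster weighting matters under ICS, the sketch below weights each observation by the inverse of its cluster size, which is the weighting idea behind CWGEE. It is a deliberately simplified case (an intercept-only mean, with ordinal scores treated as numbers), not the proportional odds model used in the study, and the data and the name `cluster_weighted_mean` are hypothetical.

```python
def cluster_weighted_mean(clusters):
    """Mean in which each observation carries weight 1 / (size of its cluster)."""
    num = 0.0
    den = 0.0
    for values in clusters.values():
        w = 1.0 / len(values)    # inverse cluster size weight
        num += w * sum(values)
        den += w * len(values)
    return num / den


# Hypothetical data: patient id -> tooth-level scores.
# The patient with fewer teeth has worse scores, so cluster size is informative.
data = {"p1": [3, 3], "p2": [1, 1, 1, 1, 1, 1]}

n_obs = sum(len(v) for v in data.values())
unweighted = sum(sum(v) for v in data.values()) / n_obs  # dominated by the large cluster
weighted = cluster_weighted_mean(data)                   # each patient counts equally
print(unweighted, weighted)
```

With these made-up numbers the unweighted mean is 1.5, while the cluster-weighted mean is 2.0, the average of the two patient-level means. The weighted version answers a question about a typical patient rather than a typical tooth.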

Results

The simulation results showed that including CS in the imputation model can significantly improve estimation accuracy when ICS exists. For multilevel ordinal outcomes, FCS provided more accurate and robust estimates than JM, and both outperformed CCA. We further applied these strategies to a real dental study to assess the association between metabolic syndrome and clinical attachment loss scores. The results based on FCS+CS indicated that applying an appropriate MI strategy increases the power of the analysis.

Conclusions

MI is an effective tool to increase the accuracy and power of the downstream statistical analysis for missing ordinal outcomes. FCS slightly outperforms JM when imputing multilevel ordinal outcomes. When there is plausible ICS, we recommend including CS in the imputation phase.
Metadata
Title
Multiple imputation methods for missing multilevel ordinal outcomes
Authors
Mei Dong
Aya Mitani
Publication date
01-12-2023
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2023
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-023-01909-5