Published in: BMC Medical Research Methodology 1/2006

Open Access 01-12-2006 | Research article

Dealing with missing data in a multi-question depression scale: a comparison of imputation methods

Authors: Fiona M Shrive, Heather Stuart, Hude Quan, William A Ghali



Abstract

Background

Missing data present a challenge to many research projects. The problem is often pronounced in studies that use self-report scales, and literature addressing strategies for dealing with missing data in such circumstances is scarce. The objective of this study was to compare six imputation techniques for dealing with missing data in the Zung Self-Rating Depression Scale (SDS).

Methods

A total of 1580 participants from a surgical outcomes study completed the SDS. The SDS is a 20-question scale that respondents complete by circling a value from 1 to 4 for each question. The responses are summed, and respondents are classified as exhibiting depressive symptoms when their total score is over 40. Missing values were simulated by randomly selecting questions whose values were then deleted (a missing completely at random simulation). Missing at random and missing not at random simulations were also completed. Six imputation methods were then considered: 1) multiple imputation, 2) single regression, 3) individual mean, 4) overall mean, 5) participant's preceding response, and 6) random selection of a value from 1 to 4. For each method, the imputed mean SDS score and standard deviation were compared to the population statistics. The Spearman correlation coefficient, percent misclassified, and the Kappa statistic were also calculated.
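The scoring rule, the missing completely at random deletion, and two of the simpler imputation methods can be sketched in Python as follows. This is a minimal illustration, not the authors' code: all function names are hypothetical, and deleting a fixed fraction of items within each respondent is an assumption about how the simulation could be set up.

```python
import random
from statistics import mean

CUTOFF = 40  # a total SDS score over 40 indicates depressive symptoms


def delete_mcar(responses, fraction, rng):
    """Return a copy with `fraction` of the items set to None.

    Items are chosen uniformly at random, so missingness does not depend
    on any observed or unobserved value (missing completely at random).
    """
    n_missing = round(fraction * len(responses))
    missing = set(rng.sample(range(len(responses)), n_missing))
    return [None if i in missing else r for i, r in enumerate(responses)]


def impute_individual_mean(responses):
    """Fill each missing item with the respondent's own observed mean.

    Assumes at least one item was answered.
    """
    fill = mean(r for r in responses if r is not None)
    return [fill if r is None else r for r in responses]


def impute_random(responses, rng):
    """Fill each missing item with a random value from 1 to 4."""
    return [rng.randint(1, 4) if r is None else r for r in responses]


def sds_total(responses):
    """Sum the item scores of a fully observed or fully imputed scale."""
    return sum(responses)


def is_depressed(responses):
    """Classify a respondent using the over-40 cutoff."""
    return sds_total(responses) > CUTOFF


if __name__ == "__main__":
    rng = random.Random(2006)  # fixed seed so the sketch is repeatable
    complete = [rng.randint(1, 4) for _ in range(20)]  # toy respondent
    with_gaps = delete_mcar(complete, 0.10, rng)
    imputed = impute_individual_mean(with_gaps)
    print(sds_total(complete), sds_total(imputed), is_depressed(imputed))
```

Because the individual mean is computed from the same respondent's answered items, it needs no information from other participants, which is one reason the method is easy to explain to readers.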

Results

When 10% of values were missing, all the imputation methods except random selection produced Kappa statistics greater than 0.80, indicating 'near perfect' agreement. Multiple imputation (MI) produced the most valid imputed values, with a high Kappa statistic (0.89), although both single regression and individual mean imputation also produced favorable results. As the percentage of missing information increased to 30%, or when unbalanced missing data were introduced, MI maintained a high Kappa statistic. The individual mean and single regression methods produced Kappas in the 'substantial agreement' range (0.76 and 0.74, respectively).
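The agreement measures reported above compare each respondent's classification from the complete data with the classification after imputation. A minimal Python sketch of percent misclassified and Cohen's Kappa for this two-category case follows. The function names are hypothetical, and the labelling function encodes only the two bands quoted in the results (above 0.80, and above 0.60 up to 0.80), following the commonly used Landis and Koch cut points.

```python
def percent_misclassified(truth, imputed):
    """Percentage of respondents whose classification changed."""
    changed = sum(t != m for t, m in zip(truth, imputed))
    return 100.0 * changed / len(truth)


def cohen_kappa(truth, imputed):
    """Cohen's Kappa for two binary classifications of the same people.

    Kappa = (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) when chance agreement equals 1, for
    example when both classifications place everyone in the same category.
    """
    n = len(truth)
    observed = sum(t == m for t, m in zip(truth, imputed)) / n
    p_truth = sum(truth) / n      # proportion depressed, complete data
    p_imputed = sum(imputed) / n  # proportion depressed, imputed data
    expected = p_truth * p_imputed + (1 - p_truth) * (1 - p_imputed)
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map a Kappa value to the bands quoted in the results."""
    if kappa > 0.80:
        return "near perfect"
    if kappa > 0.60:
        return "substantial"
    return "moderate or lower"


if __name__ == "__main__":
    # Toy classifications (True = total score over 40); not study data.
    truth = [True, True, False, False]
    imputed = [True, False, False, False]
    print(percent_misclassified(truth, imputed), cohen_kappa(truth, imputed))
```

Kappa is preferred over raw agreement here because it discounts the agreement expected by chance, which can be large when most respondents fall on the same side of the cutoff.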

Conclusion

Multiple imputation is the most accurate method for dealing with missing data in most of the missing data scenarios we assessed for the SDS. Imputing the individual's mean is also an appropriate and simple method for dealing with missing data that may be more interpretable to the majority of medical readers. Researchers should consider conducting methodological assessments such as this one when confronted with missing data. The optimal method should balance validity, ease of interpretability for readers, and analysis expertise of the research team.
Metadata
Title
Dealing with missing data in a multi-question depression scale: a comparison of imputation methods
Authors
Fiona M Shrive
Heather Stuart
Hude Quan
William A Ghali
Publication date
01-12-2006
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2006
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-6-57
