Published in: BMC Medical Research Methodology 1/2021

Open Access 01-12-2021 | Technical advance

Data quality assessment and subsampling strategies to correct distributional bias in prevalence studies

Authors: A. D’Ambrosio, J. Garlasco, F. Quattrocolo, C. Vicentini, C. M. Zotti


Abstract

Background

Healthcare-associated infections (HAIs) represent a major public health issue. Hospital-based prevalence studies are a common tool for HAI surveillance, but data quality problems and non-representativeness can undermine their reliability.

Methods

This study proposes three algorithms that, given a convenience sample and variables relevant for the outcome of the study, select a subsample with specific distributional characteristics, boosting either representativeness (Probability and Distance procedures) or the balance of risk factors (Uniformity procedure). A “Quality Score” (QS) was also developed to grade sampled units according to data completeness and reliability.
The methodologies were evaluated through bootstrapping on a convenience sample of 135 hospitals collected during the 2016 Italian Point Prevalence Survey (PPS) on HAIs.
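The general idea of evaluating a subsampling rule by bootstrapping can be sketched as follows. This is a minimal illustration and not the authors' Probability, Distance or Uniformity procedure: the weighting rule (population share divided by sample share of a single stratum variable), the function names, the field names `stratum`, `patients` and `cases`, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter


def weighted_subsample(units, population_share, k, rng):
    """Draw up to k units without replacement, favouring strata that are
    under-represented in `units` relative to the reference population.
    Hypothetical rule, not the procedure described in the paper."""
    n = len(units)
    counts = Counter(u["stratum"] for u in units)
    # Weight = population share / sample share of the unit's stratum.
    weights = [population_share[u["stratum"]] / (counts[u["stratum"]] / n)
               for u in units]
    pool = list(range(n))  # indices still available for selection
    chosen = []
    for _ in range(min(k, n)):
        pos = rng.choices(range(len(pool)),
                          weights=[weights[i] for i in pool], k=1)[0]
        chosen.append(units[pool.pop(pos)])
    return chosen


def prevalence(units):
    """Pooled prevalence: total cases over total patients."""
    patients = sum(u["patients"] for u in units)
    cases = sum(u["cases"] for u in units)
    return cases / patients if patients else float("nan")


def bootstrap_prevalence(units, population_share, k, n_boot, seed=0):
    """Resample hospitals with replacement, apply the subsampling rule
    to each resample, and collect the resulting prevalence estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(units) for _ in units]
        subsample = weighted_subsample(resample, population_share, k, rng)
        estimates.append(prevalence(subsample))
    return estimates


# Toy convenience sample (invented numbers): large hospitals are
# over-represented compared with the assumed population shares.
hospitals = (
    [{"stratum": "small", "patients": 100, "cases": 5} for _ in range(5)]
    + [{"stratum": "large", "patients": 400, "cases": 40} for _ in range(15)]
)
population_share = {"small": 0.6, "large": 0.4}

estimates = bootstrap_prevalence(hospitals, population_share, k=10, n_boot=200)
print(min(estimates), max(estimates))
```

The spread of `estimates` across bootstrap replicates gives a rough sense of how stable the subsample-based prevalence is. A real analysis would use several variables at once and the selection rules defined in the paper.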

Results

The QS highlighted wide variations in data quality among hospitals (median QS 52.9 points, range 7.98–628, lower meaning better quality), with most problems ascribable to ward- and hospital-related data reporting. Both Distance and Probability procedures produced subsamples with lower distributional bias (log-likelihood score increased from 7.3 to 29 points). The Uniformity procedure increased the homogeneity of the sample characteristics (e.g., −58.4% in geographical variability).
The procedures selected hospitals with higher data quality, especially the Probability procedure (lower QS in 100% of bootstrap simulations). The Distance procedure produced lower HAI prevalence estimates (6.98% compared to 7.44% in the convenience sample), more in line with the European median.

Conclusions

The QS and the subsampling procedures proposed in this study could represent effective tools to improve the quality of prevalence studies, decreasing the biases that can arise due to non-probabilistic sample collection.
Metadata
Title
Data quality assessment and subsampling strategies to correct distributional bias in prevalence studies
Authors
A. D’Ambrosio
J. Garlasco
F. Quattrocolo
C. Vicentini
C. M. Zotti
Publication date
01-12-2021
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2021
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-021-01277-y
