Published in: BMC Medical Research Methodology 1/2015

Open Access 01-12-2015 | Research article

Aiming for a representative sample: Simulating random versus purposive strategies for hospital selection

Authors: Loan R. van Hoeven, Mart P. Janssen, Kit C. B. Roes, Hendrik Koffijberg



Abstract

Background

A ubiquitous issue in research is that of selecting a representative sample from the study population. While random sampling strategies are the gold standard, in practice random sampling of participants is not always feasible, nor is it necessarily the optimal choice. In our case, 12 hospitals must be selected out of 89 Dutch hospitals in total. This selection of 12 hospitals should make it possible to estimate blood use in the remaining hospitals as well. In this paper, we evaluate both random and purposive strategies for the case of estimating blood use in Dutch hospitals.

Methods

Available population-wide data on hospital blood use and number of hospital beds are used to simulate five sampling strategies: (1) select only the largest hospitals, (2) select the largest and the smallest hospitals (‘maximum variation’), (3) select hospitals randomly, (4) select hospitals from as many different geographic regions as possible, (5) select hospitals from only two regions. Simulating each strategy yields different selections of hospitals, each of which is used to estimate blood use in the remaining hospitals. The estimates are compared to the actual population values; the resulting prediction errors are used to indicate the quality of the sampling strategy.
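As an illustration of this kind of simulation, the following minimal sketch compares three of the five strategies on synthetic data. It is not the authors' implementation: the population is randomly generated, the per-bed ratio estimator and the mean absolute percentage error are assumptions chosen for simplicity, and all function names are hypothetical. The two regional strategies are left out because the sketch carries no region data.

```python
import random

def make_population(n=89, seed=0):
    """Synthetic stand-in for the population data: (beds, blood_use) per hospital."""
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        beds = rng.randint(100, 1200)
        # Assumed relation: blood use roughly proportional to beds, with noise.
        blood_use = beds * 10 * rng.uniform(0.7, 1.3)
        population.append((beds, blood_use))
    return population

def select_largest(population, k, rng=None):
    """Strategy 1: the k hospitals with the most beds."""
    order = sorted(range(len(population)), key=lambda i: population[i][0], reverse=True)
    return order[:k]

def select_max_variation(population, k, rng=None):
    """Strategy 2: the smallest and the largest hospitals, split evenly."""
    order = sorted(range(len(population)), key=lambda i: population[i][0])
    half = k // 2
    return order[:half] + order[len(order) - (k - half):]

def select_random(population, k, rng):
    """Strategy 3: simple random sample without replacement."""
    return rng.sample(range(len(population)), k)

def prediction_error(population, sample_idx):
    """Mean absolute percentage error for the hospitals outside the sample."""
    sampled = set(sample_idx)
    total_beds = sum(population[i][0] for i in sampled)
    total_use = sum(population[i][1] for i in sampled)
    rate = total_use / total_beds  # assumed estimator: blood use per bed
    errors = [abs(rate * beds - use) / use
              for i, (beds, use) in enumerate(population) if i not in sampled]
    return sum(errors) / len(errors)

population = make_population()
rng = random.Random(1)
strategies = [("largest", select_largest),
              ("maximum variation", select_max_variation),
              ("random", select_random)]
for name, strategy in strategies:
    chosen = strategy(population, 12, rng)
    print(f"{name}: hospital-level error {prediction_error(population, chosen):.3f}")
```

Because the data are synthetic, the printed errors say nothing about the ranking reported in the study; the sketch only shows how a selection, an estimate for the remaining hospitals and a prediction error fit together.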

Results

The strategy leading to the lowest prediction error in the case study was maximum variation sampling, followed by random, regional variation and two-region sampling, with sampling the largest hospitals resulting in the worst performance. Maximum variation sampling led to a hospital-level prediction error of 15 %, whereas random sampling led to a prediction error of 19 % (95 % CI 17 %-26 %). While lowering the sample size reduced the differences between maximum variation and the random strategies, increasing sample size to n = 18 did not change the ranking of the strategies and led to only slightly better predictions.
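Unlike the deterministic strategies, random sampling gives a different selection on every draw, so its error is naturally summarised over many simulated draws. The sketch below shows one way such a summary could be obtained for several sample sizes. It assumes a percentile interval over repeated draws, a per-bed ratio estimate and synthetic data; the abstract does not state how the reported interval was computed, and the function names are hypothetical.

```python
import random
import statistics

def random_sampling_errors(beds, use, k, n_sim, rng):
    """Prediction errors of a per-bed ratio estimate over repeated random samples."""
    n = len(beds)
    results = []
    for _ in range(n_sim):
        chosen = set(rng.sample(range(n), k))
        rate = sum(use[i] for i in chosen) / sum(beds[i] for i in chosen)
        rest = [i for i in range(n) if i not in chosen]
        # Mean absolute percentage error over the hospitals not sampled.
        results.append(sum(abs(rate * beds[i] - use[i]) / use[i] for i in rest) / len(rest))
    return results

def summarize(errors):
    """Median plus the 2.5th and 97.5th percentiles of the simulated errors."""
    cuts = statistics.quantiles(errors, n=40)  # 39 cut points in 2.5 % steps
    return statistics.median(errors), cuts[0], cuts[-1]

# Synthetic population of 89 hospitals (assumed, not the study data).
rng = random.Random(42)
beds = [rng.randint(100, 1200) for _ in range(89)]
use = [b * 10 * rng.uniform(0.7, 1.3) for b in beds]

for k in (6, 12, 18):
    median, low, high = summarize(random_sampling_errors(beds, use, k, 1000, rng))
    print(f"n = {k}: median error {median:.3f}, 95 % interval {low:.3f} to {high:.3f}")
```

Running the same loop for each sample size makes it possible to check whether a larger sample changes the spread of the error, which is the comparison described above.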

Conclusions

The optimal strategy for estimating blood use was maximum variation sampling. When proxy data are available, it is possible to evaluate random and purposive sampling strategies using simulations before the start of the study. The results enable researchers to make a more educated choice of an appropriate sampling strategy.
Metadata
Title
Aiming for a representative sample: Simulating random versus purposive strategies for hospital selection
Authors
Loan R. van Hoeven
Mart P. Janssen
Kit C. B. Roes
Hendrik Koffijberg
Publication date
01-12-2015
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2015
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-015-0089-8
