Published in: BMC Medical Research Methodology 1/2015

Open Access 01-12-2015 | Research article

Using interviewer random effects to remove selection bias from HIV prevalence estimates

Authors: Mark E McGovern, Till Bärnighausen, Joshua A Salomon, David Canning

Abstract

Background

Selection bias in HIV prevalence estimates occurs if non-participation in testing is correlated with HIV status. Longitudinal data suggest that individuals who know or suspect they are HIV positive are less likely to participate in testing in HIV surveys. In that case, methods that correct for missing data by imputation based on observed characteristics will produce biased results.

Methods

The identity of the HIV survey interviewer is typically associated with HIV testing participation, but is unlikely to be correlated with HIV status. Interviewer identity can thus be used as a selection variable, allowing estimation of Heckman-type selection models. These models produce asymptotically unbiased HIV prevalence estimates, even when non-participation is correlated with unobserved characteristics, such as knowledge of HIV status. We introduce a new random effects method for these selection models which overcomes three problems with existing approaches: non-convergence caused by collinearity, small sample bias, and incorrect inference. Our method is easy to implement in standard statistical software, and allows the construction of bootstrapped standard errors which adjust for the fact that the relationship between testing and HIV status is uncertain and needs to be estimated.
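To illustrate why interviewer identity matters here, the following is a minimal simulation sketch of the data-generating process that a Heckman-type selection model assumes. It is not the authors' estimation code and does not fit the selection model. It only shows that prevalence among tested respondents drifts away from true prevalence when the unobserved determinants of testing and of HIV status are correlated. The function name `simulate` and all parameter values (the error correlation `rho`, the interviewer effect standard deviation `sigma_u`, the participation intercept, and the true prevalence) are assumptions chosen purely for illustration.

```python
import math
import random
from statistics import NormalDist


def simulate(n_interviewers=50, per_interviewer=400, rho=-0.6,
             sigma_u=0.5, intercept=0.5, true_prev=0.15, seed=1):
    """Simulate testing participation and HIV status with correlated errors.

    Returns (prevalence in the full sample, prevalence among those tested).
    All parameter values are hypothetical, not taken from the article.
    """
    rng = random.Random(seed)
    # Threshold chosen so that P(HIV positive) equals true_prev.
    b = NormalDist().inv_cdf(true_prev)
    hiv, tested = [], []
    for _ in range(n_interviewers):
        # Interviewer random effect: shifts participation, not HIV status.
        u = rng.gauss(0.0, sigma_u)
        for _ in range(per_interviewer):
            e1 = rng.gauss(0.0, 1.0)
            # e2 has unit variance and correlation rho with e1.
            e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            tested.append(intercept + u + e1 > 0)
            hiv.append(b + e2 > 0)
    full_sample = sum(hiv) / len(hiv)
    among_tested = [h for h, t in zip(hiv, tested) if t]
    return full_sample, sum(among_tested) / len(among_tested)


if __name__ == "__main__":
    for rho in (0.0, -0.6):
        full, observed = simulate(rho=rho)
        print(f"rho={rho:+.1f}  full sample={full:.3f}  tested only={observed:.3f}")
```

With a negative `rho`, people who are more likely to be HIV positive are less likely to consent, so the tested-only figure understates prevalence. With `rho` set to zero the two figures should roughly agree, which is the case in which imputation based on observed characteristics would be adequate.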

Results

Using nationally representative data from the Demographic and Health Surveys, we illustrate our approach with new point estimates and confidence intervals (CI) for HIV prevalence among men in Ghana (2003) and Zambia (2007). In Ghana, we find little evidence of selection bias, as our selection model gives an HIV prevalence estimate of 1.4% (95% CI 1.2% – 1.6%), compared to 1.6% among those with a valid HIV test. In Zambia, our selection model gives an HIV prevalence estimate of 16.3% (95% CI 11.0% – 18.4%), compared to 12.1% among those with a valid HIV test. Therefore, those who decline to test in Zambia are found to be more likely to be HIV positive.

Conclusions

Our approach corrects for selection bias in HIV prevalence estimates, can be implemented even when HIV prevalence or non-participation is very high or very low, and provides a practical way to account for both sampling and parameter uncertainty when estimating confidence intervals. The wide confidence intervals estimated in an example with high HIV prevalence indicate that it is difficult to correct statistically for the bias that may occur when a large proportion of people refuse to test.
Metadata
Title
Using interviewer random effects to remove selection bias from HIV prevalence estimates
Authors
Mark E McGovern
Till Bärnighausen
Joshua A Salomon
David Canning
Publication date
01-12-2015
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2015
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-15-8
