Published in: BMC Medical Research Methodology 1/2024

Open Access 01-12-2024 | Research

Health estimate differences between six independent web surveys: different web surveys, different results?

Authors: Rainer Schnell, Jonas Klingwort

Abstract

Most general population web surveys are based on online panels maintained by commercial survey agencies. Many of these panels are based on non-probability samples. However, survey agencies differ in their panel selection and management strategies. Little is known about whether these different strategies cause differences in survey estimates. This paper presents the results of a systematic study designed to analyze the differences in web survey results between agencies. Six different survey agencies were commissioned with the same web survey using an identical standardized questionnaire covering factual health items. Five surveys were fielded at the same time. A calibration approach was used to control the effect of demographics on the outcome. Overall, the results show differences between probability and non-probability surveys in health estimates, which were reduced but not eliminated by weighting. Furthermore, the differences between non-probability surveys before and after weighting are larger than expected between random samples from the same population.
Footnotes
1
NPS-1 is operated by GMI (now part of Kantar), NPS-2 is operated by SSI (now named Dynata), NPS-3 is operated by Ipsos, NPS-4 is the WiSo-panel [39], PS-1 is operated by Forsa, and PS-2 is the GESIS panel.
 
2
Therefore, many different panel management strategies could impact differences between agencies, for example, recruitment, payment, control and web interface. Furthermore, providers may have different panel attrition problems or suffer from different panel conditioning effects. Separating these effects could form a research program on its own.
 
3
The data would have been available in a closely supervised research data center, but initially, PS-2 was not able to grant access to the research data center within six months. Later, Covid-19 restrictions delayed access to the research data center.
 
4
Since two heterogeneous kinds of samples have to be compared, we have no meta-analysis problem, which excludes standard measures of heterogeneity. Therefore, we use multiple pairwise comparisons (Tukey’s HSD) between the weighted means of surveys.
 
5
Comparing p-values with a fixed threshold is rarely advised [47]. We use t-tests here as rough indicators for differences larger than expected, not to make decisions about a hypothesis. However, the effect measure Cohen's d is related to t: \(|t|=\sqrt{\left( n_1 n_2\right) /\left( n_1+n_2\right) }\, d\) [48]. The factor for multiplying d to yield t lies between about 38.7 and 50 for all comparisons. Due to this monotonic transformation, an analysis based on d would, therefore, yield comparable results. To help interpret the results, we additionally report effect sizes using Cohen's d.
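As a worked illustration of the relation in this footnote, the sketch below computes the factor that converts Cohen's d into |t|. The function names and the equal group sizes of 3,000 and 5,000 are assumptions, chosen only because they reproduce factors of about 38.7 and 50; they are not taken from the study's reported sample sizes.

```python
import math


def d_to_t_factor(n1: int, n2: int) -> float:
    """Return sqrt(n1 * n2 / (n1 + n2)), the multiplier that turns d into |t|."""
    return math.sqrt(n1 * n2 / (n1 + n2))


def t_from_d(d: float, n1: int, n2: int) -> float:
    """Absolute t statistic implied by an effect size d for two groups."""
    return d_to_t_factor(n1, n2) * abs(d)


# Hypothetical equal group sizes (assumption, not the study's n).
factor_small = d_to_t_factor(3000, 3000)  # sqrt(1500)
factor_large = d_to_t_factor(5000, 5000)  # sqrt(2500)
t_example = t_from_d(0.04, 5000, 5000)    # factor_large * 0.04
```

Because the factor depends only on the two group sizes, ordering comparisons by d or by |t| gives the same ranking whenever the group sizes are the same.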
 
6
Age was used with six categories (18–24, 25–29, 30–39, 40–49, 50–64, and 65+), gender with two categories, education with five categories, size of the municipality with three categories (10,000–20,000, 20,001–100,000, 100,000 and more inhabitants) and region with 16 categories (the German federal states). The GREG weighting model can be written as age \(*\) gender \(*\) education \(*\) size of municipality \(*\) federal state.
 
7
Between 0.4% (NPS-2) and 8.3% (PS-1) of respondents did not answer at least one question on demography.
 
8
During the weight computations, empty cells in the weighting model were replaced with one pseudo-observation for each missing cell. The number of created pseudo-observations per survey was 1,566 for NPS-1, 1,628 for NPS-2, 1,505 for NPS-3, 1,965 for NPS-4, 1,714 for PS-1, and 1,839 for PS-2. After calculating the weights, the pseudo-observations were removed from the data set.
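The padding step described in this footnote can be sketched as follows. This is a minimal illustration, not the authors' code: the integer category labels, the helpers `pad_empty_cells` and `drop_pseudo`, and the toy respondents are hypothetical, and only the category counts (6 × 2 × 5 × 3 × 16) come from footnote 6. The weight computation itself is omitted.

```python
from itertools import product

# Category counts follow footnote 6; the integer labels are placeholders.
LEVELS = {
    "age": range(6),
    "gender": range(2),
    "education": range(5),
    "municipality_size": range(3),
    "federal_state": range(16),
}
ALL_CELLS = set(product(*LEVELS.values()))  # full cross-classification


def pad_empty_cells(respondent_cells):
    """Return (padded, n_pseudo): all respondents plus one flagged
    pseudo-observation for every cell that has no respondent."""
    observed = set(respondent_cells)
    missing = sorted(ALL_CELLS - observed)
    padded = [(cell, False) for cell in respondent_cells]
    padded += [(cell, True) for cell in missing]  # True marks a pseudo-observation
    return padded, len(missing)


def drop_pseudo(padded):
    """Remove the pseudo-observations again once the weights are computed."""
    return [cell for cell, is_pseudo in padded if not is_pseudo]


# Toy data (hypothetical): three respondents, two of them in the same cell.
respondents = [(0, 0, 0, 0, 0), (0, 0, 0, 0, 0), (5, 1, 4, 2, 15)]
padded, n_pseudo = pad_empty_cells(respondents)
restored = drop_pseudo(padded)
```

If the weighting model is read as a full cross-classification, it has 2,880 cells, so the reported 1,505 to 1,965 pseudo-observations would mean that roughly 52% to 68% of the cells were empty in each survey.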
 
9
Effect sizes of mode differences are rarely published in survey methodology. However, [56] reports 0.04 as the mean of Cohen’s d for 138 items compared between a face-to-face survey and a mixed-mode survey. Compared to these values, the mean effects of NPS vs PS are larger.
 
Literature
8.
Blair J, Czaja R, Blair EA. Designing Surveys: A Guide to Decisions and Procedures. 3rd ed. Thousand Oaks: Sage; 2014.
14.
Bethlehem J. Web Surveys in Official Statistics. In: Engel U, Jann B, Lynn P, Scherpenzeel A, Sturgis P, editors. Improving Survey Methods: Lessons from Recent Research. New York: Routledge; 2015. p. 156–69.
15.
Leenheer J, Scherpenzeel AC. Does It Pay Off to Include Non-Internet Households in an Internet Panel? Int J Internet Sci. 2013;8(1):17–29.
16.
Blom AG, Herzing JME, Cornesse C, Sakshaug JW, Krieger U, Bossert D. Does the Recruitment of Offline Households Increase the Sample Representativeness of Probability-Based Online Panels? Evidence From the German Internet Panel. Soc Sci Comput Rev. 2016;35(4):498–520. https://doi.org/10.1177/0894439316651584.
21.
Bethlehem J, Biffignandi S. Handbook of Web Surveys. Hoboken: Wiley; 2012.
24.
Czajka JL, Beyler A. Declining Response Rates in Federal Surveys: Trends and Implications (Background Paper). 2016. Technical Report Final Report – Volume I, Mathematica Policy Research.
26.
de Leeuw E, Hox J, Luiten A. International Nonresponse Trends Across Countries and Years: An Analysis of 36 Years of Labour Force Survey Data. Surv Insights Methods Field. 2018;1–11.
30.
Tourangeau R, Conrad FG, Couper MP. The Science of Web Surveys. New York: Oxford University Press; 2013.
32.
Braekman E, Charafeddine R, Demarest S, Drieskens S, Berete F, Gisle L, et al. Comparing Web-based Versus Face-to-face and Paper-and-pencil Questionnaire Data Collected Through Two Belgian Health Surveys. Int J Publ Health. 2020;1–12. https://doi.org/10.1007/s00038-019-01327-9.
35.
Zhou XH, Zhou C, Liu D, Ding X. Applied Missing Data Analysis in the Health Sciences. Hoboken: Wiley; 2014.
36.
Little RJA, Rubin DB. Statistical Analysis with Missing Data. 3rd ed. Hoboken: Wiley; 2020.
37.
Särndal CE, Swensson B, Wretman J. Model Assisted Survey Sampling. New York: Springer; 1992.
38.
Särndal CE, Lundström S. Estimation in Surveys with Nonresponse. Chichester: Wiley; 2005.
39.
Göritz A. Determinants of the Starting Rate and the Completion Rate in Online Panel Studies. In: Callegaro M, Baker RP, Bethlehem J, Göritz A, Krosnick JA, Lavrakas PJ, editors. Online Panel Research: A Data Quality Perspective. Hoboken: Wiley; 2014. p. 154–70.
40.
Güllner M, Schmitt LH. Innovation in der Markt- und Sozialforschung: das forsa.omninet-Panel [Innovations in market research: The forsa.omninet-panel, in German]. Sozialwissenschaften Berufspraxis. 2004;27(1):11–22.
41.
Gößwald A, Lange M, Dölle R, Hölling H. Die erste Welle der Studie zur Gesundheit Erwachsener in Deutschland (DEGS1): Gewinnung von Studienteilnehmenden, Durchführung der Feldarbeit und Qualitätsmanagement [The First Wave of the Study of Adult Health in Germany (DEGS1): Recruitment of Study Participants, Fieldwork Implementation, and Quality Management, in German]. Bundesgesundheitsbl Gesundheitsforsch Gesundheitsschutz. 2013;56(5). https://doi.org/10.1007/s00103-013-1671-z.
42.
Kamtsiuris P, Lange M, Hoffmann R, Rosario AS, Dahm S, Kuhnert R, et al. Die erste Welle der Studie zur Gesundheit Erwachsener in Deutschland (DEGS1): Stichprobendesign, Response, Gewichtung und Repräsentativität [The first wave of the Study of Adult Health in Germany (DEGS1): sampling design, response, weighting, and representativeness, in German]. Bundesgesundheitsbl Gesundheitsforsch Gesundheitsschutz. 2013;56(5). https://doi.org/10.1007/s00103-012-1650-9.
43.
RKI. Beiträge zur Gesundheitsberichterstattung des Bundes - Daten und Fakten: Ergebnisse der Studie Gesundheit in Deutschland aktuell 2012 [Contributions to federal health reporting – Facts and figures: Results of the study on current health in Germany 2012, in German]. Abteilung für Epidemiologie und Gesundheitsmonitoring. Berlin: Robert Koch-Institut; 2014.
44.
Saß AC, Lange C, Finger JD, Allen J, Born S, Hoebel J, et al. Supplement: Fragebogen zur Studie ‘Gesundheit in Deutschland aktuell’: GEDA 2014/2015-EHIS [Supplement: Questionnaire for the study ‘Current Health in Germany’: GEDA 2014/2015-EHIS, in German]. J Health Monit. 2017;2(1):106–34.
45.
Forschungsdatenzentrum ALLBUS. ALLBUS 2014 Fragebogendokumentation: Material zu den Datensätzen der Studiennummern ZA5240 und ZA5241 [ALLBUS 2014 questionnaire documentation: material on the data sets of study numbers ZA5240 and ZA5241, in German]. 2014.
46.
Destatis. Statistik und Wissenschaft: Demographische Standards Ausgabe 2010 [Statistics and Science: Demographic Standards Edition 2010, in German]. Wiesbaden; 2010.
49.
Bickel DR. Genomics Data Analysis: False Discovery Rates and Empirical Bayes Methods. Boca Raton: CRC Press; 2020.
50.
Callegaro M, Manfreda KL, Vehovar V. Web Survey Methodology. Los Angeles: Sage; 2015.
52.
Potter F, Zheng Y. Methods and Issues in Trimming Extreme Weights in Sample Surveys. JSM Proc Surv Res Methods Sect. 2015;2707–19.
53.
Chen Q, Elliott MR, Haziza D, Yang Y, Ghosh M, Little RJA, et al. Approaches to Improving Survey-Weighted Estimates. Stat Sci. 2017;32(2):227–48.
55.
Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale: Erlbaum; 1988.
59.
DiSogra C, Cobb C, Chan E, Dennis JM. Calibrating Non-Probability Internet Samples with Probability Samples Using Early Adopter Characteristics. In: Proceedings of Joint Statistical Meetings (JSM). Alexandria: American Statistical Association, Section on Survey Research Methods; 2011. p. 4501–15.
60.
Gelman A, Little TC. Poststratification into many categories using hierarchical logistic regression. Surv Methodol. 1997;23(2):127–35.
61.
Rosenbaum PR, Rubin DB. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika. 1983;70(1):41–55.
Metadata
Title
Health estimate differences between six independent web surveys: different web surveys, different results?
Authors
Rainer Schnell
Jonas Klingwort
Publication date
01-12-2024
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2024
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-023-02122-0
