Skip to main content
Top
Published in: Quality of Life Research 5/2018

Open Access 01-05-2018

Assessing the equivalence of Web-based and paper-and-pencil questionnaires using differential item and test functioning (DIF and DTF) analysis: a case of the Four-Dimensional Symptom Questionnaire (4DSQ)

Authors: Berend Terluin, Evelien P. M. Brouwers, Miquelle A. G. Marchand, Henrica C. W. de Vet

Published in: Quality of Life Research | Issue 5/2018

Login to get access

Abstract

Purpose

Many paper-and-pencil (P&P) questionnaires have been migrated to electronic platforms. Differential item and test functioning (DIF and DTF) analysis constitutes a superior research design to assess measurement equivalence across modes of administration. The purpose of this study was to demonstrate an item response theory (IRT)-based DIF and DTF analysis to assess the measurement equivalence of a Web-based version and the original P&P format of the Four-Dimensional Symptom Questionnaire (4DSQ), measuring distress, depression, anxiety, and somatization.

Methods

The P&P group (n = 2031) and the Web group (n = 958) consisted of primary care psychology clients. Unidimensionality and local independence of the 4DSQ scales were examined using IRT and Yen’s Q3. Bifactor modeling was used to assess the scales’ essential unidimensionality. Measurement equivalence was assessed using IRT-based DIF analysis using a 3-stage approach: linking on the latent mean and variance, selection of anchor items, and DIF testing using the Wald test. DTF was evaluated by comparing expected scale scores as a function of the latent trait.

Results

The 4DSQ scales proved to be essentially unidimensional in both modalities. Five items, belonging to the distress and somatization scales, displayed small amounts of DIF. DTF analysis revealed that the impact of DIF on the scale level was negligible.

Conclusions

IRT-based DIF and DTF analysis is demonstrated as a way to assess the equivalence of Web-based and P&P questionnaire modalities. Data obtained with the Web-based 4DSQ are equivalent to data obtained with the P&P version.
Appendix
Available only for authorised users
Literature
1.
go back to reference van Gelder, M. M., Bretveld, R. W., & Roeleveld, N. (2010). Web-based questionnaires: the future in epidemiology? American Journal of Epidemiology, 172(11), 1292–1298.CrossRefPubMed van Gelder, M. M., Bretveld, R. W., & Roeleveld, N. (2010). Web-based questionnaires: the future in epidemiology? American Journal of Epidemiology, 172(11), 1292–1298.CrossRefPubMed
2.
go back to reference Coons, S. J., Gwaltney, C. J., Hays, R. D., Lundy, J. J., Sloan, J. A., Revicki, D. A., Lenderking, W. R., Cella, D., & Basch, E. & on behalf of the ISPOR ePRO Task Force. (2009). Recommendations on evidence needed to support measurement equivalence between electronic and paper-based patient-reported outcome (PRO) measures: ISPOR ePRO Good Research Practices Task Force report. Value in Health, 12(4), 419–429.CrossRefPubMed Coons, S. J., Gwaltney, C. J., Hays, R. D., Lundy, J. J., Sloan, J. A., Revicki, D. A., Lenderking, W. R., Cella, D., & Basch, E. & on behalf of the ISPOR ePRO Task Force. (2009). Recommendations on evidence needed to support measurement equivalence between electronic and paper-based patient-reported outcome (PRO) measures: ISPOR ePRO Good Research Practices Task Force report. Value in Health, 12(4), 419–429.CrossRefPubMed
3.
go back to reference Gwaltney, C. J., Shields, A. L., & Shiffman, S. (2008). Equivalence of electronic and paper-and-pencil administration of patient-reported outcome measures: A meta-analytic review. Value in Health, 11(2), 322–333.CrossRefPubMed Gwaltney, C. J., Shields, A. L., & Shiffman, S. (2008). Equivalence of electronic and paper-and-pencil administration of patient-reported outcome measures: A meta-analytic review. Value in Health, 11(2), 322–333.CrossRefPubMed
4.
go back to reference Campbell, N., Ali, F., Finlay, A. Y., & Salek, S. S. (2015). Equivalence of electronic and paper-based patient-reported outcome measures. Quality of Life Research, 24(8), 1949–1961.CrossRefPubMed Campbell, N., Ali, F., Finlay, A. Y., & Salek, S. S. (2015). Equivalence of electronic and paper-based patient-reported outcome measures. Quality of Life Research, 24(8), 1949–1961.CrossRefPubMed
5.
go back to reference Muehlhausen, W., Doll, H., Quadri, N., Fordham, B., O’Donohoe, P., Dogar, N., & Wild, D. J. (2015). Equivalence of electronic and paper administration of patient-reported outcome measures: A systematic review and meta-analysis of studies conducted between 2007 and 2013. Health and Quality of Life Outcomes, 13, 167.CrossRefPubMedPubMedCentral Muehlhausen, W., Doll, H., Quadri, N., Fordham, B., O’Donohoe, P., Dogar, N., & Wild, D. J. (2015). Equivalence of electronic and paper administration of patient-reported outcome measures: A systematic review and meta-analysis of studies conducted between 2007 and 2013. Health and Quality of Life Outcomes, 13, 167.CrossRefPubMedPubMedCentral
6.
go back to reference Rutherford, C., Costa, D., Mercieca-Bebber, R., Rice, H., Gabb, L., & King, M. (2016). Mode of administration does not cause bias in patient-reported outcome results: A meta-analysis. Quality of Life Research, 25(3), 559–574.CrossRefPubMed Rutherford, C., Costa, D., Mercieca-Bebber, R., Rice, H., Gabb, L., & King, M. (2016). Mode of administration does not cause bias in patient-reported outcome results: A meta-analysis. Quality of Life Research, 25(3), 559–574.CrossRefPubMed
7.
go back to reference Twiss, J., McKenna, S. P., Graham, J., Swetz, K., Sloan, J., & Gomberg-Maitland, M. (2016). Applying Rasch analysis to evaluate measurement equivalence of different administration formats of the Activity Limitation scale of the Cambridge Pulmonary Hypertension Outcome Review (CAMPHOR). Health and Quality of Life Outcomes, 14, 57.CrossRefPubMedPubMedCentral Twiss, J., McKenna, S. P., Graham, J., Swetz, K., Sloan, J., & Gomberg-Maitland, M. (2016). Applying Rasch analysis to evaluate measurement equivalence of different administration formats of the Activity Limitation scale of the Cambridge Pulmonary Hypertension Outcome Review (CAMPHOR). Health and Quality of Life Outcomes, 14, 57.CrossRefPubMedPubMedCentral
8.
go back to reference Ferrando, P. J., & Lorenzo-Seva, U. (2005). IRT-related factor analytic procedures for testing the equivalence of paper-and-pencil and Internet-administered questionnaires. Psychological Methods, 10(2), 193–205.CrossRefPubMed Ferrando, P. J., & Lorenzo-Seva, U. (2005). IRT-related factor analytic procedures for testing the equivalence of paper-and-pencil and Internet-administered questionnaires. Psychological Methods, 10(2), 193–205.CrossRefPubMed
9.
go back to reference Teresi, J. A., & Fleishman, J. A. (2007). Differential item functioning and health assessment. Quality of Life Research, 16, 33–42.CrossRefPubMed Teresi, J. A., & Fleishman, J. A. (2007). Differential item functioning and health assessment. Quality of Life Research, 16, 33–42.CrossRefPubMed
10.
go back to reference Einarsdóttir, S., & Rounds, J. (2009). Gender bias and construct validity in vocational interest measurement: Differential item functioning in the Strong Interest Inventory. Journal of Vocational Behavior, 74(3), 295–307.CrossRef Einarsdóttir, S., & Rounds, J. (2009). Gender bias and construct validity in vocational interest measurement: Differential item functioning in the Strong Interest Inventory. Journal of Vocational Behavior, 74(3), 295–307.CrossRef
11.
go back to reference Petersen, M. A., Groenvold, M., Bjorner, J. B., Aaronson, N., Conroy, T., Cull, A., Fayers, P., Hjermstad, M., Sprangers, M., & Sullivan, M. & For the European Organisation for Research and Treatment of Cancer Quality of Life Group. (2003). Use of differential item functioning analysis to assess the equivalence of translations of a questionnaire. Quality of Life Research, 12, 373–385.CrossRefPubMed Petersen, M. A., Groenvold, M., Bjorner, J. B., Aaronson, N., Conroy, T., Cull, A., Fayers, P., Hjermstad, M., Sprangers, M., & Sullivan, M. & For the European Organisation for Research and Treatment of Cancer Quality of Life Group. (2003). Use of differential item functioning analysis to assess the equivalence of translations of a questionnaire. Quality of Life Research, 12, 373–385.CrossRefPubMed
12.
go back to reference Zumbo, B. D. (2007). Three generations of DIF analyses: Considering where it has been, where it is now, and where it is going. Language Assessment Quarterly, 4(2), 223–233.CrossRef Zumbo, B. D. (2007). Three generations of DIF analyses: Considering where it has been, where it is now, and where it is going. Language Assessment Quarterly, 4(2), 223–233.CrossRef
13.
go back to reference Donovan, M. A., Drasgow, F., & Probst, T. M. (2000). Does computerizing paper-and-pencil job attitude scales make a difference? New IRT analyses offer insight. Journal of Applied Psychology, 85(2), 305–313.CrossRefPubMed Donovan, M. A., Drasgow, F., & Probst, T. M. (2000). Does computerizing paper-and-pencil job attitude scales make a difference? New IRT analyses offer insight. Journal of Applied Psychology, 85(2), 305–313.CrossRefPubMed
14.
go back to reference Whitaker, B. G., & McKinney, J. L. (2007). Assessing the measurement invariance of latent job satisfaction ratings across survey administration modes for respondent subgroups: A MIMIC modeling approach. Behavior Research Methods, 39(3), 502–509.CrossRefPubMed Whitaker, B. G., & McKinney, J. L. (2007). Assessing the measurement invariance of latent job satisfaction ratings across survey administration modes for respondent subgroups: A MIMIC modeling approach. Behavior Research Methods, 39(3), 502–509.CrossRefPubMed
15.
go back to reference Swartz, R. J., de Moor, C., Cook, K. F., Fouladi, R. T., Basen-Engquist, K., Eng, C., & Taylor, C. C. L. (2007). Mode effects in the center for epidemiologic studies depression (CES-D) scale: personal digital assistant vs. paper and pencil administration. Quality of Life Research, 16(5), 803–813.CrossRefPubMed Swartz, R. J., de Moor, C., Cook, K. F., Fouladi, R. T., Basen-Engquist, K., Eng, C., & Taylor, C. C. L. (2007). Mode effects in the center for epidemiologic studies depression (CES-D) scale: personal digital assistant vs. paper and pencil administration. Quality of Life Research, 16(5), 803–813.CrossRefPubMed
16.
go back to reference Michaelides, M. P. (2008). An illustration of a Mantel-Haenszel procedure to flag misbehaving common items in test equating. Practical Assessment, Research & Evaluation, 13, 7. Michaelides, M. P. (2008). An illustration of a Mantel-Haenszel procedure to flag misbehaving common items in test equating. Practical Assessment, Research & Evaluation, 13, 7.
17.
go back to reference Dorans, N. J., Schmitt, A. P., & Bleistein, C. A. (1992). The standardization approach to assessing comprehensive differential item functioning. Journal of Educational Measurement, 29, 309–319.CrossRef Dorans, N. J., Schmitt, A. P., & Bleistein, C. A. (1992). The standardization approach to assessing comprehensive differential item functioning. Journal of Educational Measurement, 29, 309–319.CrossRef
18.
go back to reference Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense. Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.
19.
go back to reference Tay, L., Meade, A. W., & Cao, M. Y. (2015). An overview and practical guide to IRT measurement equivalence analysis. Organizational Research Methods, 18(1), 3–46.CrossRef Tay, L., Meade, A. W., & Cao, M. Y. (2015). An overview and practical guide to IRT measurement equivalence analysis. Organizational Research Methods, 18(1), 3–46.CrossRef
20.
go back to reference Gregorich, S. E. (2006). Do self-report instruments allow meaningful comparisons across diverse population groups? Testing measurement invariance using the confirmatory factor analysis framework. Medical Care, 44(11), S78-S94.PubMedCentral Gregorich, S. E. (2006). Do self-report instruments allow meaningful comparisons across diverse population groups? Testing measurement invariance using the confirmatory factor analysis framework. Medical Care, 44(11), S78-S94.PubMedCentral
21.
go back to reference Terluin, B., van Marwijk, H. W. J., Adèr, H. J., de Vet, H. C. W., Penninx, B. W. J. H., Hermens, M. L. M., van Boeijen, C. A., van Balkom, A. J. L. M., van der Klink, J. J. L., & Stalman, W. A. B. (2006). The Four-Dimensional Symptom Questionnaire (4DSQ): a validation study of a multidimensional self-report questionnaire to assess distress, depression, anxiety and somatization. BMC Psychiatry, 6, 34.CrossRefPubMedPubMedCentral Terluin, B., van Marwijk, H. W. J., Adèr, H. J., de Vet, H. C. W., Penninx, B. W. J. H., Hermens, M. L. M., van Boeijen, C. A., van Balkom, A. J. L. M., van der Klink, J. J. L., & Stalman, W. A. B. (2006). The Four-Dimensional Symptom Questionnaire (4DSQ): a validation study of a multidimensional self-report questionnaire to assess distress, depression, anxiety and somatization. BMC Psychiatry, 6, 34.CrossRefPubMedPubMedCentral
22.
go back to reference Ridner, S. H. (2004). Psychological distress: concept analysis. Journal of Advanced Nursing, 45(5), 536–545.CrossRefPubMed Ridner, S. H. (2004). Psychological distress: concept analysis. Journal of Advanced Nursing, 45(5), 536–545.CrossRefPubMed
23.
go back to reference Snaith, R. P. (1987). The concepts of mild depression. British Journal of Psychiatry, 150, 387–393.CrossRefPubMed Snaith, R. P. (1987). The concepts of mild depression. British Journal of Psychiatry, 150, 387–393.CrossRefPubMed
24.
go back to reference Beck, A. T., Rush, A. J., Shaw, B. F., & Emery, G. (1979). Cognitive therapy of depression. New York: Guilford Press. Beck, A. T., Rush, A. J., Shaw, B. F., & Emery, G. (1979). Cognitive therapy of depression. New York: Guilford Press.
25.
go back to reference Terluin, B., Brouwers, E. P. M., van Marwijk, H. W. J., Verhaak, P. F. M., & van der Horst, H. E. (2009). Detecting depressive and anxiety disorders in distressed patients in primary care; comparative diagnostic accuracy of the Four-Dimensional Symptom Questionnaire (4DSQ) and the Hospital Anxiety and Depression Scale (HADS). BMC Family Practice, 10, 58.CrossRefPubMedPubMedCentral Terluin, B., Brouwers, E. P. M., van Marwijk, H. W. J., Verhaak, P. F. M., & van der Horst, H. E. (2009). Detecting depressive and anxiety disorders in distressed patients in primary care; comparative diagnostic accuracy of the Four-Dimensional Symptom Questionnaire (4DSQ) and the Hospital Anxiety and Depression Scale (HADS). BMC Family Practice, 10, 58.CrossRefPubMedPubMedCentral
26.
go back to reference van Avendonk, M. J. P., Hassink-Franke, L. J. A., Terluin, B., van Marwijk, H. W. J., Wiersma, T., & Burgers, J. S. (2012). NHG-Standaard Angst (tweede herziening) [Summarisation of the NHG practice guideline ‘Anxiety’]. Nederlands Tijdschrift voor Geneeskunde, 156(34), A4509.PubMed van Avendonk, M. J. P., Hassink-Franke, L. J. A., Terluin, B., van Marwijk, H. W. J., Wiersma, T., & Burgers, J. S. (2012). NHG-Standaard Angst (tweede herziening) [Summarisation of the NHG practice guideline ‘Anxiety’]. Nederlands Tijdschrift voor Geneeskunde, 156(34), A4509.PubMed
27.
go back to reference Terluin, B., Oosterbaan, D. B., Brouwers, E. P. M., van Straten, A., van de Ven, P. M., Langerak, W., & van Marwijk, H. W. J. (2014). To what extent does the anxiety scale of the Four-Dimensional Symptom Questionnaire (4DSQ) detect specific types of anxiety disorder in primary care? A psychometric study. BMC Psychiatry, 14, 121.CrossRefPubMedPubMedCentral Terluin, B., Oosterbaan, D. B., Brouwers, E. P. M., van Straten, A., van de Ven, P. M., Langerak, W., & van Marwijk, H. W. J. (2014). To what extent does the anxiety scale of the Four-Dimensional Symptom Questionnaire (4DSQ) detect specific types of anxiety disorder in primary care? A psychometric study. BMC Psychiatry, 14, 121.CrossRefPubMedPubMedCentral
28.
go back to reference Clarke, D. M., & Smith, G. C. (2000). Somatisation: what is it? Australian Family Physician, 29, 109–113.PubMed Clarke, D. M., & Smith, G. C. (2000). Somatisation: what is it? Australian Family Physician, 29, 109–113.PubMed
29.
go back to reference de Vroege, L., Emons, W. H. M., Sijtsma, K., Hoedeman, R., & van der Feltz-Cornelis, C. M. (2015). Validation of the 4DSQ somatization subscale in the occupational health care setting as a screener. Journal of Occupational Rehabilitation, 25, 105–115.CrossRefPubMed de Vroege, L., Emons, W. H. M., Sijtsma, K., Hoedeman, R., & van der Feltz-Cornelis, C. M. (2015). Validation of the 4DSQ somatization subscale in the occupational health care setting as a screener. Journal of Occupational Rehabilitation, 25, 105–115.CrossRefPubMed
30.
go back to reference Terluin, B., van Rhenen, W., Schaufeli, W. B., & de Haan, M. (2004). The Four-Dimensional Symptom Questionnaire (4DSQ): measuring distress and other mental health problems in a working population. Work Stress, 18, 187–207.CrossRef Terluin, B., van Rhenen, W., Schaufeli, W. B., & de Haan, M. (2004). The Four-Dimensional Symptom Questionnaire (4DSQ): measuring distress and other mental health problems in a working population. Work Stress, 18, 187–207.CrossRef
31.
go back to reference van Ginkel, J. R., & van der Ark, L. A. (2005). SPSS syntax for missing value imputation in test and questionnaire data. Applied Psychological Measurement, 29(2), 152–153.CrossRef van Ginkel, J. R., & van der Ark, L. A. (2005). SPSS syntax for missing value imputation in test and questionnaire data. Applied Psychological Measurement, 29(2), 152–153.CrossRef
32.
go back to reference Sijtsma, K., & van der Ark, L. A. (2003). Investigation and treatment of missing item scores in test and questionnaire data. Multivariate Behavioral Research, 38(4), 505–528.CrossRefPubMed Sijtsma, K., & van der Ark, L. A. (2003). Investigation and treatment of missing item scores in test and questionnaire data. Multivariate Behavioral Research, 38(4), 505–528.CrossRefPubMed
33.
go back to reference Reeve, B. B., Hays, R. D., Bjorner, J. B., Cook, K. F., Crane, P. K., Teresi, J. A., Thissen, D., Revicki, D. A., Weiss, D. J., Hambleton, R. K., Liu, H. H., Gershon, R., Reise, S. P., Lai, J. S., & Cella, D. (2007). Psychometric evaluation and calibration of health-related quality of life item banks - Plans for the patient-reported outcomes measurement information system (PROMIS). Medical Care, 45(5), S22-S31. Reeve, B. B., Hays, R. D., Bjorner, J. B., Cook, K. F., Crane, P. K., Teresi, J. A., Thissen, D., Revicki, D. A., Weiss, D. J., Hambleton, R. K., Liu, H. H., Gershon, R., Reise, S. P., Lai, J. S., & Cella, D. (2007). Psychometric evaluation and calibration of health-related quality of life item banks - Plans for the patient-reported outcomes measurement information system (PROMIS). Medical Care, 45(5), S22-S31.
34.
go back to reference Reise, S. P., Morizot, J., & Hays, R. D. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(Suppl 1), 19–31.CrossRefPubMed Reise, S. P., Morizot, J., & Hays, R. D. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(Suppl 1), 19–31.CrossRefPubMed
35.
go back to reference Stout, W. F. (1990). A new item response theory modeling approach with applications to unidimensional assessment and ability estimation. Psychometrika, 55(2), 293–326.CrossRef Stout, W. F. (1990). A new item response theory modeling approach with applications to unidimensional assessment and ability estimation. Psychometrika, 55(2), 293–326.CrossRef
36.
go back to reference Chen, W. H., & Thissen, D. (1997). Local dependence indexes for item pairs: Using item response theory. Journal of Educational and Behavioral Statistics, 22(3), 265–289.CrossRef Chen, W. H., & Thissen, D. (1997). Local dependence indexes for item pairs: Using item response theory. Journal of Educational and Behavioral Statistics, 22(3), 265–289.CrossRef
37.
go back to reference Cai, L., & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical & Statistical Psychology, 66(2), 245–276.CrossRef Cai, L., & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical & Statistical Psychology, 66(2), 245–276.CrossRef
38.
go back to reference Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.CrossRef Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.CrossRef
39.
go back to reference Glöckner-Rist, A., & Hoijtink, H. (2003). The best of both worlds: Factor analysis of dichotomous data using item response theory and structural equation modeling. Structural Equation Modeling, 10(4), 544–565.CrossRef Glöckner-Rist, A., & Hoijtink, H. (2003). The best of both worlds: Factor analysis of dichotomous data using item response theory and structural equation modeling. Structural Equation Modeling, 10(4), 544–565.CrossRef
40.
go back to reference Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the 3-parameter logistic model. Applied Psychological Measurement, 8(2), 125–145.CrossRef Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the 3-parameter logistic model. Applied Psychological Measurement, 8(2), 125–145.CrossRef
41.
go back to reference Zenisky, A. L., Hambleton, R. K., & Sireci, S. G. (2003). Effects of local item dependence on the validity of IRT item, test, and ability statistics. Washington, DC: Association of American Medical Colleges. Zenisky, A. L., Hambleton, R. K., & Sireci, S. G. (2003). Effects of local item dependence on the validity of IRT item, test, and ability statistics. Washington, DC: Association of American Medical Colleges.
42.
go back to reference Christensen, K. B., Makransky, G., & Horton, M. (2016). Critical values for Yen’s Q3: Identification of local dependence in the Rasch model using residual correlations. Applied Psychological Measurement, 41(3), 178–194.CrossRef Christensen, K. B., Makransky, G., & Horton, M. (2016). Critical values for Yen’s Q3: Identification of local dependence in the Rasch model using residual correlations. Applied Psychological Measurement, 41(3), 178–194.CrossRef
44.
go back to reference Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Applying bifactor statistical indices in the evaluation of psychological measures. Journal of Personality Assessment, 98(3), 223–237.CrossRefPubMed Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Applying bifactor statistical indices in the evaluation of psychological measures. Journal of Personality Assessment, 98(3), 223–237.CrossRefPubMed
45.
go back to reference Reise, S. P., Scheines, R., Widaman, K. F., & Haviland, M. G. (2012). Multidimensionality and structural coefficient bias in structural equation modeling: A bifactor perspective. Educational and Psychological Measurement, 73(1), 5–26.CrossRef Reise, S. P., Scheines, R., Widaman, K. F., & Haviland, M. G. (2012). Multidimensionality and structural coefficient bias in structural equation modeling: A bifactor perspective. Educational and Psychological Measurement, 73(1), 5–26.CrossRef
46.
go back to reference Bonifay, W. E., Reise, S. P., Scheines, R., & Meijer, R. R. (2015). When are multidimensional data unidimensional enough for structural equation modeling? An evaluation of the DETECT multidimensionality index. Structural Equation Modeling: A Multidisciplinary Journal, 22(4), 504–516.CrossRef Bonifay, W. E., Reise, S. P., Scheines, R., & Meijer, R. R. (2015). When are multidimensional data unidimensional enough for structural equation modeling? An evaluation of the DETECT multidimensionality index. Structural Equation Modeling: A Multidisciplinary Journal, 22(4), 504–516.CrossRef
47.
go back to reference Cohen, J. (1977). Statistical power analysis for the behavioral sciences. New York: Academic Press. Cohen, J. (1977). Statistical power analysis for the behavioral sciences. New York: Academic Press.
48.
go back to reference Hays, R. D., Morales, L. S., & Reise, S. P. (2000). Item response theory and health outcomes measurement in the 21st century. Medical Care, 38(9), 28–42. Hays, R. D., Morales, L. S., & Reise, S. P. (2000). Item response theory and health outcomes measurement in the 21st century. Medical Care, 38(9), 28–42.
49.
go back to reference Nguyen, T. H., Han, H. R., Kim, M. T., & Chan, K. S. (2014). An introduction to item response theory for patient-reported outcome measurement. Patient, 7(1), 23–35.CrossRefPubMedPubMedCentral Nguyen, T. H., Han, H. R., Kim, M. T., & Chan, K. S. (2014). An introduction to item response theory for patient-reported outcome measurement. Patient, 7(1), 23–35.CrossRefPubMedPubMedCentral
50.
go back to reference Langer, M. M. (2008). A reexamination of Lord’s Wald test for differential item functioning using item response theory and modern error estimation. Chapel Hill: University of North Carolina. Langer, M. M. (2008). A reexamination of Lord’s Wald test for differential item functioning using item response theory and modern error estimation. Chapel Hill: University of North Carolina.
51.
go back to reference Woods, C. M., Cai, L., & Wang, M. (2012). The Langer-improved Wald test for DIF testing with multiple groups. Educational and Psychological Measurement, 73(3), 532–547.CrossRef Woods, C. M., Cai, L., & Wang, M. (2012). The Langer-improved Wald test for DIF testing with multiple groups. Educational and Psychological Measurement, 73(3), 532–547.CrossRef
52.
go back to reference Meade, A. W. (2010). A taxonomy of measurement invariance effect size indices. Journal of Applied Psychology, 95(4), 728–743.CrossRefPubMed Meade, A. W. (2010). A taxonomy of measurement invariance effect size indices. Journal of Applied Psychology, 95(4), 728–743.CrossRefPubMed
53.
go back to reference Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29.CrossRef Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29.CrossRef
55.
go back to reference R Core Team. (2016). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. R Core Team. (2016). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Metadata
Title
Assessing the equivalence of Web-based and paper-and-pencil questionnaires using differential item and test functioning (DIF and DTF) analysis: a case of the Four-Dimensional Symptom Questionnaire (4DSQ)
Authors
Berend Terluin
Evelien P. M. Brouwers
Miquelle A. G. Marchand
Henrica C. W. de Vet
Publication date
01-05-2018
Publisher
Springer International Publishing
Published in
Quality of Life Research / Issue 5/2018
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-018-1816-5

Other articles of this Issue 5/2018

Quality of Life Research 5/2018 Go to the issue