Published in: Quality of Life Research 2/2008

Open Access 01-03-2008

Nonparametric IRT analysis of Quality-of-Life Scales and its application to the World Health Organization Quality-of-Life Scale (WHOQOL-Bref)

Authors: Klaas Sijtsma, Wilco H. M. Emons, Samantha Bouwmeester, Ivan Nyklíček, Leo D. Roorda

Abstract

Background

This study investigates the usefulness of the nonparametric monotone homogeneity model for evaluating and constructing Health-Related Quality-of-Life Scales consisting of polytomous items, and compares it to the often-used parametric graded response model.

Methods

The nonparametric monotone homogeneity model is a general model of which all known parametric models for polytomous items are special cases. Merits, drawbacks, and possibilities of nonparametric and parametric models and available software are discussed. Particular attention is given to the monotone homogeneity model (also known as the Mokken model), and the often-used parametric graded response model.
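
The relation between the two models can be written out as a short sketch. The notation below is an assumption introduced for illustration and does not appear in the abstract: $\theta$ is the latent trait, $X_i$ the score on item $i$ with ordered categories $0, \dots, m$, and $a_i$, $b_{ix}$ are the slope and location parameters of the graded response model.

```latex
% Monotone homogeneity model: besides unidimensionality and local
% independence, each item step response function is only required
% to be nondecreasing in the latent trait.
\[
  P(X_i \ge x \mid \theta) \ \text{is nondecreasing in } \theta,
  \qquad x = 1, \dots, m .
\]

% Graded response model: the same function is given a logistic form.
\[
  P(X_i \ge x \mid \theta)
  = \frac{\exp\{a_i(\theta - b_{ix})\}}{1 + \exp\{a_i(\theta - b_{ix})\}},
  \qquad a_i > 0 .
\]
```

Because the logistic function with $a_i > 0$ is increasing in $\theta$, the graded response model satisfies the monotonicity requirement, which is the sense in which it is a special case of the monotone homogeneity model.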

Results

Data from the WHOQOL-Bref were analyzed using both the monotone homogeneity model and the graded response model. The monotone homogeneity model analysis yielded unidimensional scales for each content domain. Scalability coefficients further showed that some items have limited scalability with respect to the other items in the same scale. The parametric IRT analyses led to the rejection of some of the items.
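
For readers unfamiliar with scalability coefficients, the following is a minimal sketch of how item and total-scale coefficients can be computed from a respondents-by-items score matrix. It is not the procedure or software used in the study. The function names (`covariance`, `max_covariance`, `scalability`) and the toy data are hypothetical. The sketch assumes the common definition of the coefficient as the ratio of observed inter-item covariance to the maximum covariance attainable given the item marginals, and it assumes every item has nonzero variance.

```python
from itertools import combinations


def covariance(x, y):
    """Population covariance of two equally long score sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def max_covariance(x, y):
    """Largest covariance attainable with the observed marginals.

    Pairing the sorted scores of both items gives the maximum.
    """
    return covariance(sorted(x), sorted(y))


def scalability(data):
    """Return (item coefficients, total-scale coefficient).

    data: list of respondents, each a list of ordered item scores.
    Assumes every item has nonzero variance (otherwise division by zero).
    """
    items = list(zip(*data))  # one tuple of scores per item
    k = len(items)
    cov = [[0.0] * k for _ in range(k)]
    cmax = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        cov[i][j] = cov[j][i] = covariance(items[i], items[j])
        cmax[i][j] = cmax[j][i] = max_covariance(items[i], items[j])

    # Item coefficient: item i against all other items in the scale.
    h_items = [
        sum(cov[i][j] for j in range(k) if j != i)
        / sum(cmax[i][j] for j in range(k) if j != i)
        for i in range(k)
    ]
    # Total-scale coefficient: all item pairs pooled.
    pairs = list(combinations(range(k), 2))
    h_total = sum(cov[i][j] for i, j in pairs) / sum(cmax[i][j] for i, j in pairs)
    return h_items, h_total


# Hypothetical toy data: 5 respondents, 3 items scored 0-4.
toy = [[0, 0, 0], [1, 0, 0], [2, 1, 0], [3, 2, 1], [4, 4, 2]]
h_items, h_total = scalability(toy)
print(h_items, h_total)
```

In this toy matrix all items order the respondents identically, so every coefficient reaches its maximum of 1. An item with limited scalability would show a markedly lower item coefficient than the other items in the same scale.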

Conclusions

The nonparametric monotone homogeneity model is highly suited for data analysis in a health-related quality-of-life context, and the parametric graded response model may add interesting features to measurement provided the model fits the data well.
Metadata
Title
Nonparametric IRT analysis of Quality-of-Life Scales and its application to the World Health Organization Quality-of-Life Scale (WHOQOL-Bref)
Authors
Klaas Sijtsma
Wilco H. M. Emons
Samantha Bouwmeester
Ivan Nyklíček
Leo D. Roorda
Publication date
01-03-2008
Publisher
Springer Netherlands
Published in
Quality of Life Research / Issue 2/2008
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-007-9281-6