Published in: BMC Medical Research Methodology 1/2010

Open Access 01-12-2010 | Research article

Methodological issues regarding power of classical test theory (CTT) and item response theory (IRT)-based approaches for the comparison of patient-reported outcomes in two groups of patients - a simulation study

Authors: Véronique Sébille, Jean-Benoit Hardouin, Tanguy Le Néel, Gildas Kubis, François Boyer, Francis Guillemin, Bruno Falissard

Abstract

Background

Patient-Reported Outcomes (PRO) are increasingly used in clinical and epidemiological research. Two main analytical strategies are used for these data: classical test theory (CTT), based on the observed scores, and models from Item Response Theory (IRT). However, whether IRT or CTT is the more appropriate method for analysing PRO data remains unknown. The statistical properties of CTT and IRT, regarding power and corresponding effect sizes, were compared.

Methods

Two-group cross-sectional studies were simulated for the comparison of PRO data using IRT or CTT-based analysis. For IRT, different scenarios were investigated according to whether item or person parameters were assumed to be known (for item parameters, known with a precision ranging from good to poor) or unknown, in which case they had to be estimated. The powers obtained with IRT and CTT were compared, and the parameters having the strongest impact on them were identified.
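As a rough illustration of the CTT arm of such a simulation, the sketch below generates dichotomous item responses for two groups and estimates power empirically from a comparison of sum scores. It is not the authors' procedure: the Rasch model, the unit-variance normal latent trait, the evenly spaced item difficulties, the large-sample z approximation to the two-sample test, and all function names are assumptions made for this example.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_scores(n, group_mean, difficulties, rng):
    """Return sum scores for n patients under an assumed Rasch model."""
    scores = []
    for _ in range(n):
        # Latent trait drawn from N(group_mean, 1); unit variance is an assumption.
        theta = rng.gauss(group_mean, 1.0)
        score = 0
        for b in difficulties:
            # Rasch probability of a positive response to an item of difficulty b.
            p = 1.0 / (1.0 + math.exp(-(theta - b)))
            score += int(rng.random() < p)
        scores.append(score)
    return scores


def ctt_power(n_per_group, effect, difficulties, n_sim=500, alpha=0.05, seed=1):
    """Empirical power of a two-sided comparison of mean sum scores."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sim):
        g0 = simulate_scores(n_per_group, 0.0, difficulties, rng)
        g1 = simulate_scores(n_per_group, effect, difficulties, rng)
        # Standard error of the difference in means (unpooled variances).
        se = math.sqrt(variance(g0) / n_per_group + variance(g1) / n_per_group)
        if se > 0 and abs(mean(g1) - mean(g0)) / se > z_crit:
            rejections += 1
    return rejections / n_sim


# Hypothetical 10-item questionnaire with difficulties spread over [-2, 2].
items = [-2 + 4 * j / 9 for j in range(10)]
print(ctt_power(n_per_group=100, effect=0.5, difficulties=items))
```

Calling `ctt_power` with difficulty lists of different lengths gives a quick way to explore how the number of items affects the empirical power in this simplified setting.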

Results

When person parameters were assumed to be unknown, and whether item parameters were known or not, the powers achieved using IRT and CTT were similar and always lower than the power expected from the well-known sample size formula for normally distributed endpoints. The number of items had a substantial impact on power for both methods.
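The abstract does not write out the sample size formula it refers to. A minimal sketch, assuming it is the standard two-sided formula for comparing two normal means with a standardised effect size d, n per group = 2(z_{1-α/2} + z_{1-β})² / d², is given below; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def expected_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided two-sample comparison of normal means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Normal approximation; the negligible opposite-tail rejection is ignored.
    return z.cdf(effect_size * math.sqrt(n_per_group / 2) - z_crit)


def required_n_per_group(effect_size, power=0.80, alpha=0.05):
    """Patients per group needed to detect effect_size with the given power."""
    z = NormalDist()
    n = 2 * (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2 / effect_size ** 2
    return math.ceil(n)


print(required_n_per_group(0.5))  # 63 patients per group for d = 0.5, 80% power
print(expected_power(63, 0.5))
```

This is the benchmark against which an empirical power, obtained by simulation, can be compared.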

Conclusion

In the absence of missing data, IRT and CTT seem to provide comparable power. The classical sample size formula seems to be adequate for CTT under some conditions but is not appropriate for IRT. In IRT, it seems important to account for the number of items to obtain an accurate sample size formula.
Metadata
Publication date
01-12-2010
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2010
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-10-24
