Published in: BMC Medical Research Methodology 1/2018

Open Access 01-12-2018 | Research article

Estimation of an inter-rater intra-class correlation coefficient that overcomes common assumption violations in the assessment of health measurement scales

Authors: Carly A. Bobak, Paul J. Barr, A. James O’Malley


Abstract

Background

Intraclass correlation coefficients (ICCs) are recommended for assessing the reliability of measurement scales. However, the ICC rests on several statistical assumptions, such as normality and constant variance, which are rarely considered in health applications.

Methods

A Bayesian approach using hierarchical regression and variance-function modeling is proposed to estimate the ICC, with emphasis on accounting for heterogeneous variances across a measurement scale. As an application, we review the use of an ICC to evaluate the reliability of Observer OPTION5, an instrument in which trained raters evaluate the level of Shared Decision Making between clinicians and patients. Two raters evaluated recordings of 311 clinical encounters across three studies that assessed the impact of using a Personal Decision Aid over usual care. We focus in particular on deriving an estimate of the ICC when the data come from multiple studies.
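
For orientation, the quantity being estimated can be written in the standard variance-components form. This is a sketch under assumptions, not necessarily the authors' exact parameterization: the indices (study s, encounter i, rater j), the symbols, and in particular the log-linear variance function with covariate x and coefficients gamma are illustrative choices that do not appear in the abstract.

```latex
% Hierarchical model for rating j of encounter i in study s (assumed notation)
y_{sij} = \mu_s + \theta_{si} + \varepsilon_{sij}, \qquad
\theta_{si} \sim N\!\left(0, \sigma^2_{b,s}\right), \qquad
\varepsilon_{sij} \sim N\!\left(0, \sigma^2_{w,si}\right)

% Study-specific ICC when the within-encounter variance is constant within a study
\mathrm{ICC}_s = \frac{\sigma^2_{b,s}}{\sigma^2_{b,s} + \sigma^2_{w,s}}

% One possible variance function allowing heteroscedasticity (illustrative assumption)
\log \sigma^2_{w,si} = \gamma_0 + \gamma_1 x_{si}
```

Under such a variance function the within-encounter variance, and therefore the ICC, can differ from one encounter to the next rather than being a single number for the whole scale.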

Results

The results demonstrate that the ICC varies substantially across studies and across patient-physician encounters within studies. Using the new framework we developed, the study-specific ICCs were estimated to be 0.821, 0.295, and 0.644. If the within- and between-encounter variances were assumed to be the same across studies, the estimated within-study ICC was 0.609. If heteroscedasticity was not properly adjusted for, the within-study ICC estimate was inflated to as high as 0.640. Finally, if the data were pooled across studies without accounting for the variability between studies, ICC estimates were further inflated by approximately 0.02, while formally allowing for between-study variation in the ICC inflated its estimated value by approximately 0.066 to 0.072, depending on the model.
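
The mechanism behind the pooling inflation can be illustrated with a small sketch. It uses the classical one-way ANOVA estimator of the ICC rather than the Bayesian model described in this article, and the two "studies" are made-up ratings (the second is the first shifted by 40 points), so the numbers bear no relation to the OPTION5 data. The function name `icc_oneway` is hypothetical.

```python
from statistics import mean


def icc_oneway(ratings):
    """Classical one-way random-effects ICC for a list of per-encounter rating tuples."""
    n = len(ratings)      # number of encounters
    k = len(ratings[0])   # number of raters per encounter
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-encounter and within-encounter mean squares from one-way ANOVA
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical ratings: two raters, four encounters per study.
study_a = [(18, 20), (24, 22), (30, 31), (12, 15)]
# Same rater disagreement, but every score is 40 points higher.
study_b = [(a + 40, b + 40) for a, b in study_a]

icc_a = icc_oneway(study_a)
icc_b = icc_oneway(study_b)
# Pooling ignores the study-level shift, so it is counted as encounter variance.
icc_pooled = icc_oneway(study_a + study_b)

print(f"study A: {icc_a:.3f}, study B: {icc_b:.3f}, pooled: {icc_pooled:.3f}")
```

Because the difference in study means is absorbed into the between-encounter variance, the pooled estimate exceeds both within-study estimates even though rater agreement is identical in the two studies.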

Conclusion

We demonstrated that misuse of the ICC statistic under common assumption violations leads to misleading and likely inflated estimates of inter-rater reliability. A statistical analysis that overcomes these violations by expanding the standard statistical model to account for them yields estimates that better reflect a measurement scale’s reliability while remaining easy to interpret. Bayesian methods are particularly well suited to estimating the expanded statistical model.
Metadata
Title
Estimation of an inter-rater intra-class correlation coefficient that overcomes common assumption violations in the assessment of health measurement scales
Authors
Carly A. Bobak
Paul J. Barr
A. James O’Malley
Publication date
01-12-2018
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2018
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-018-0550-6
