Published in: BMC Medical Research Methodology 1/2014

Open Access 01-12-2014 | Research article

Comparison of confidence interval methods for an intra-class correlation coefficient (ICC)

Authors: Alexei C Ionan, Mei-Yin C Polley, Lisa M McShane, Kevin K Dobbin

Abstract

Background

The intraclass correlation coefficient (ICC) is widely used in biomedical research to assess the reproducibility of measurements between raters, labs, technicians, or devices. For example, in an inter-rater reliability study, a high ICC value means that noise variability (between-raters and within-raters) is small relative to variability from patient to patient. A confidence interval or Bayesian credible interval for the ICC is a commonly reported summary. Such intervals can be constructed employing either frequentist or Bayesian methodologies.
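The abstract does not write the ICC out as a formula. As a sketch, the usual formulation for a two-way, crossed, random effects model without interaction is given below; the symbols ($b_i$ for the patient effect, $r_j$ for the rater effect, $e_{ij}$ for within-rater error) are assumptions chosen for illustration, not notation taken from the article.

```latex
\[
  y_{ij} = \mu + b_i + r_j + e_{ij}, \qquad
  b_i \sim N(0, \sigma_b^2), \quad
  r_j \sim N(0, \sigma_r^2), \quad
  e_{ij} \sim N(0, \sigma_e^2),
\]
\[
  \mathrm{ICC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_r^2 + \sigma_e^2}.
\]
```

Under this definition the ICC approaches 1 when the noise terms $\sigma_r^2 + \sigma_e^2$ are small relative to the patient-to-patient variance $\sigma_b^2$.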

Methods

This study examines the performance of three methods for constructing an interval in a two-way, crossed, random effects model without interaction: the Generalized Confidence Interval method (GCI), the Modified Large Sample method (MLS), and a Bayesian method based on a noninformative prior distribution (NIB). We compare the coverage probabilities and widths of the intervals produced by each method, and we provide guidance on selecting a method based on study design, sample size, and normality of the data.
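To make the model concrete, here is a minimal Python sketch of the ANOVA (method-of-moments) point estimate of the ICC for a balanced table with one measurement per subject-rater cell. It is only a point estimate, not any of the three interval methods, and it is not the authors' R software; the function names and the truncation of negative variance estimates at zero are assumptions made for illustration.

```python
def mean_squares(y):
    """Return (MSB, MSL, MSE) for an I x J table y[i][j], one value per cell.

    Rows are subjects, columns are raters (or labs); no interaction term.
    """
    n_rows, n_cols = len(y), len(y[0])
    grand = sum(map(sum, y)) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in y]
    col_means = [sum(y[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    msb = n_cols * sum((m - grand) ** 2 for m in row_means) / (n_rows - 1)
    msl = n_rows * sum((m - grand) ** 2 for m in col_means) / (n_cols - 1)
    sse = sum((y[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n_rows) for j in range(n_cols))
    mse = sse / ((n_rows - 1) * (n_cols - 1))
    return msb, msl, mse


def icc_point_estimate(y):
    """Method-of-moments ICC; negative variance estimates are set to 0."""
    n_rows, n_cols = len(y), len(y[0])
    msb, msl, mse = mean_squares(y)
    # E[MSB] = J*var_subject + var_error, E[MSL] = I*var_rater + var_error
    var_subject = max(0.0, (msb - mse) / n_cols)
    var_rater = max(0.0, (msl - mse) / n_rows)
    total = var_subject + var_rater + mse
    return var_subject / total if total > 0 else float("nan")


# Three subjects scored by two raters; the second rater reads 1 unit higher.
table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(round(icc_point_estimate(table), 3))  # about 0.889
```

In this toy table the residual mean square is zero, so the estimate reduces to the subject variance (4) divided by the subject plus rater variance (4 + 0.5).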

Results

We show that, for the two-way, crossed, random effects model without interaction, care is needed in interval method selection because the interval estimates do not always have the properties that the user expects. While the different methods generally perform well when each factor has a large number of levels, large differences between the methods emerge when the number of levels of one or more factors is limited. In addition, all methods are shown to lack robustness to certain hard-to-detect violations of normality when the sample size is limited.
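Coverage probability can be checked by simulation. The following Python sketch estimates the coverage of an equal-tailed interval built from generalized pivotal quantities, in the spirit of a generalized confidence interval. It is a generic construction under stated assumptions, not the authors' implementation: it assumes a balanced design with normal effects (so each sum of squares is an expected mean square times an independent chi-square variate), truncates negative variance draws at zero, and uses hypothetical function names and illustrative variance values.

```python
import random


def chi2(rng, df):
    """Chi-square draw, using chi-square(df) = Gamma(shape=df/2, scale=2)."""
    return rng.gammavariate(df / 2.0, 2.0)


def icc_from_ems(ems, n_subjects, n_raters):
    """Map (subject, rater, error) expected mean squares to the ICC."""
    t_subject, t_rater, t_error = ems
    var_subject = max(0.0, (t_subject - t_error) / n_raters)  # truncate at 0
    var_rater = max(0.0, (t_rater - t_error) / n_subjects)
    return var_subject / (var_subject + var_rater + t_error)


def gpq_interval(ss, dfs, n_subjects, n_raters, rng, draws=500, level=0.95):
    """Equal-tailed interval from pivotal draws SS_k / W_k, W_k ~ chi2(df_k)."""
    sims = sorted(
        icc_from_ems([s / chi2(rng, d) for s, d in zip(ss, dfs)],
                     n_subjects, n_raters)
        for _ in range(draws)
    )
    alpha = 1.0 - level
    return sims[int(alpha / 2 * draws)], sims[int((1 - alpha / 2) * draws)]


def coverage(var_subject, var_rater, var_error, n_subjects, n_raters,
             reps=200, seed=1):
    """Fraction of simulated studies whose interval contains the true ICC."""
    rng = random.Random(seed)
    true_icc = var_subject / (var_subject + var_rater + var_error)
    ems = (n_raters * var_subject + var_error,
           n_subjects * var_rater + var_error,
           var_error)
    dfs = (n_subjects - 1, n_raters - 1, (n_subjects - 1) * (n_raters - 1))
    hits = 0
    for _ in range(reps):
        # Normal data only: sums of squares are EMS times chi-square variates.
        ss = [t * chi2(rng, d) for t, d in zip(ems, dfs)]
        lo, hi = gpq_interval(ss, dfs, n_subjects, n_raters, rng)
        hits += lo <= true_icc <= hi
    return hits / reps


# Illustrative setting: 20 subjects, 5 raters, true ICC = 4 / 6.
print(coverage(4.0, 1.0, 1.0, n_subjects=20, n_raters=5))
```

Because the sums of squares are drawn directly from their normal-theory distributions, this sketch says nothing about the non-normal scenarios mentioned above; those would require simulating the raw data from a non-normal model and computing the sums of squares from it.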

Conclusions

Decision rules and software programs for interval construction are provided for practical implementation in the two-way, crossed, random effects model without interaction. All interval methods perform similarly when the data are normal and each factor has a sufficient number of levels. The MLS and GCI methods outperform the NIB when one of the factors has a limited number of levels and the data are normally or nearly normally distributed. None of the methods works well if the number of levels of a factor is limited and the data are markedly non-normal. The software programs are implemented in the popular R language.
Metadata
Title
Comparison of confidence interval methods for an intra-class correlation coefficient (ICC)
Authors
Alexei C Ionan
Mei-Yin C Polley
Lisa M McShane
Kevin K Dobbin
Publication date
01-12-2014
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2014
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-14-121
