Published in: Chiropractic & Manual Therapies 1/2020

01-12-2020 | Magnetic Resonance Imaging | Research

Degenerative findings in lumbar spine MRI: an inter-rater reliability study involving three raters

Authors: Klaus Doktor, Tue Secher Jensen, Henrik Wulff Christensen, Ulrich Fredberg, Morten Kindt, Eleanor Boyle, Jan Hartvigsen


Abstract

Background

For diagnostic procedures to be clinically useful, they must be reliable. The interpretation of lumbar spine MRI scans is subject to variability, and few studies have rated the reliability of multiple degenerative pathologies simultaneously. The objective of our study was to determine the inter-rater reliability of three independent raters evaluating degenerative pathologies seen with lumbar spine MRI.

Methods

Fifty-nine people received an MRI of the lumbar spine: 35 patients with low back pain (LBP) or LBP and leg pain, and 24 people without LBP or leg pain. Three raters (one radiologist and two chiropractors) evaluated the MRIs for the presence and severity of eight degenerative spinal pathologies using a standardized format: spondylolisthesis, scoliosis, annular fissure, disc degeneration, disc contour, nerve root compromise, spinal stenosis and facet joint degeneration. Findings were identified and classified at disc level according to type and severity. Each rater was instructed to evaluate every participant once to assess inter-rater reliability (fully crossed design). Reliability was calculated using Gwet’s Agreement Coefficients (AC1 and AC2), Cohen’s kappa (κ) and Conger’s extension of Cohen’s κ to multiple raters. Gwet’s probabilistic benchmarking method was applied to the Landis and Koch scale. MRI findings achieving at least substantial reliability were considered acceptable.
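
As an illustration of the coefficients named above, the following is a minimal sketch of the unweighted Gwet AC1 and Conger's multi-rater kappa for a fully crossed design with no missing ratings. It is not the authors' analysis code: the function name `agreement_coefficients` and the toy data are hypothetical, and the weighted AC2 used for ordinal severity grades is not covered.

```python
from itertools import combinations


def agreement_coefficients(ratings, categories):
    """Return (Gwet's AC1, Conger's kappa) for a fully crossed design.

    ratings    -- one row per subject, one category label per rater
    categories -- all possible category labels (at least two)
    """
    n = len(ratings)        # number of subjects (e.g. disc levels)
    r = len(ratings[0])     # number of raters
    q = len(categories)     # number of categories

    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    pa = 0.0
    for row in ratings:
        counts = [row.count(k) for k in categories]
        pa += sum(c * (c - 1) for c in counts) / (r * (r - 1))
    pa /= n

    # AC1 chance agreement, based on the overall prevalence of each category.
    pi = [sum(row.count(k) for row in ratings) / (n * r) for k in categories]
    pe_ac1 = sum(p * (1 - p) for p in pi) / (q - 1)

    # Conger chance agreement: product of rater-specific category
    # proportions, averaged over all pairs of raters.
    props = [
        [sum(1 for row in ratings if row[g] == k) / n for k in categories]
        for g in range(r)
    ]
    pairs = list(combinations(range(r), 2))
    pe_conger = sum(
        sum(props[g][i] * props[h][i] for i in range(q)) for g, h in pairs
    ) / len(pairs)

    ac1 = (pa - pe_ac1) / (1 - pe_ac1)
    kappa = (pa - pe_conger) / (1 - pe_conger)
    return ac1, kappa


# Hypothetical toy data: 4 disc levels, 3 raters, finding absent (0) or present (1).
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 0]]
ac1, kappa = agreement_coefficients(toy, categories=[0, 1])
print(f"AC1 = {ac1:.3f}, Conger kappa = {kappa:.3f}")
```

Both coefficients share the same observed agreement and differ only in how chance agreement is estimated, which is why they can diverge when one category dominates.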

Results

For all raters combined, inter-rater reliability (Gwet’s AC1 or AC2) ranged from 0.64 to 0.99, equivalent to moderate to almost perfect reliability according to probabilistic benchmarking to the Landis and Koch scale. Overall reliability for individual pathologies was almost perfect for spondylolisthesis, spinal stenosis, scoliosis and annular fissure; substantial for nerve root compromise and disc degeneration; and moderate for facet joint degeneration and disc contour.
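
A coefficient of 0.64 falls nominally in the Landis and Koch "substantial" band (0.61–0.80), yet it is reported here as moderate. The sketch below shows one way probabilistic benchmarking can produce that result: a band is assigned only when the probability that the coefficient lies at or above the band's lower bound exceeds a threshold. This is a simplified illustration, not the authors' procedure: it assumes a normal approximation, a 0.95 threshold and an open-ended top band, and the standard errors are hypothetical because the abstract does not report them.

```python
import math

# Lower bounds of the Landis and Koch bands, highest first.
LANDIS_KOCH = [
    (0.8, "almost perfect"),
    (0.6, "substantial"),
    (0.4, "moderate"),
    (0.2, "fair"),
    (0.0, "slight"),
]


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def benchmark(coeff, se, threshold=0.95):
    """Return the highest band whose cumulative probability exceeds threshold.

    coeff -- estimated agreement coefficient
    se    -- its standard error (must be positive; hypothetical in this sketch)
    """
    for lower, label in LANDIS_KOCH:
        # Probability that the coefficient is at or above this band's lower bound.
        if normal_cdf((coeff - lower) / se) > threshold:
            return label
    return "poor"


print(benchmark(0.64, 0.05))  # moderate
print(benchmark(0.64, 0.01))  # substantial
print(benchmark(0.99, 0.01))  # almost perfect
```

The first two calls use the same point estimate with different assumed standard errors, which shows that the benchmark level depends on the precision of the estimate as well as on its value.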

Conclusion

Inter-rater reliability for three raters evaluating 177 disc levels was acceptable overall for six of the eight degenerative MRI findings in the lumbar spine. Ratings of facet joint degeneration and disc contour achieved moderate reliability and were considered unacceptable.