Journal of Imaging Informatics in Medicine | 01-06-2012

Consensus Versus Disagreement in Imaging Research: a Case Study Using the LIDC Database

Authors: Dmitriy Zinovev, Yujie Duo, Daniela S. Raicu, Jacob Furst, Samuel G. Armato

Published in: Journal of Imaging Informatics in Medicine | Issue 3/2012

Abstract

Traditionally, imaging studies that evaluate computer-aided diagnosis (CAD) compare a single label from a medical expert with a single label produced by the CAD system. The purpose of this research is to present a CAD system based on the Belief Decision Tree classification algorithm, capable of learning from probabilistic input (based on intra-reader variability) and providing probabilistic output. We compared our approach against a traditional decision tree approach with respect to a traditional performance metric (accuracy) and a probabilistic one (area under the distance–threshold curve, AuC_dt). The probabilistic classification technique outperformed the traditional one on both evaluation metrics. Specifically, under cross-validation on the training subset of instances, the probabilistic approach showed boosts of 28.26% in accuracy and 30.28% in AuC_dt. On the validation subset of instances, the corresponding boosts were 20.64% and 23.21%. In addition, we compared our CAD system results with diagnostic data available for a small subset of the Lung Image Database Consortium (LIDC) database. We found that when our CAD system errs, it generally does so with low confidence. The system's predictions also agree with the diagnoses of truly benign nodules more often than the radiologists' assessments do, which suggests a possible reduction in false positives.
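The abstract names the area under the distance–threshold curve (AuC_dt) without defining it. The sketch below shows one plausible reading, stated here as an assumption rather than the authors' definition: for each threshold, count the fraction of cases whose predicted label distribution lies within that distance of the reference distribution, then integrate the resulting curve. The distance function (total variation), the threshold grid, the function names, and the example distributions are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Distance in [0, 1] between two discrete distributions.

    Assumed choice for illustration; the abstract does not name a distance.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def distance_threshold_curve(
    predicted: Sequence[Sequence[float]],
    reference: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> List[float]:
    """For each threshold t, the fraction of cases with distance <= t."""
    distances = [total_variation(p, r) for p, r in zip(predicted, reference)]
    return [sum(d <= t for d in distances) / len(distances) for t in thresholds]


def area_under_curve(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Trapezoidal area under the points (xs[i], ys[i])."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(len(xs) - 1)
    )


# Hypothetical data: two cases, three possible labels each.
reference = [[0.5, 0.25, 0.25], [0.0, 1.0, 0.0]]
predicted = [[0.5, 0.25, 0.25], [0.5, 0.5, 0.0]]

thresholds = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
curve = distance_threshold_curve(predicted, reference, thresholds)
auc_dt = area_under_curve(thresholds, curve)
print(curve)
print(round(auc_dt, 3))
```

Under this reading, a classifier whose probabilistic output tracks the reference distribution closely reaches a high fraction at small thresholds, so its curve rises early and the area is larger. A hard-label accuracy score cannot make that distinction, which is why a probabilistic metric is reported alongside it.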

Title: Consensus Versus Disagreement in Imaging Research: a Case Study Using the LIDC Database
Authors: Dmitriy Zinovev, Yujie Duo, Daniela S. Raicu, Jacob Furst, Samuel G. Armato
Publication date: 01-06-2012
Publisher: Springer-Verlag
Published in: Journal of Imaging Informatics in Medicine / Issue 3/2012
Print ISSN: 2948-2925
Electronic ISSN: 2948-2933
DOI: https://doi.org/10.1007/s10278-011-9445-3
