Published in: Journal of Digital Imaging 5/2018

01-10-2018

Proposing New RadLex Terms by Analyzing Free-Text Mammography Reports

Authors: Hakan Bulu, Dorothy A. Sippo, Janie M. Lee, Elizabeth S. Burnside, Daniel L. Rubin

Published in: Journal of Imaging Informatics in Medicine | Issue 5/2018

Abstract

After years of development, the RadLex terminology contains a large set of controlled terms for the radiology domain, but gaps still exist. We developed a data-driven approach to discover new terms for RadLex by mining a large corpus of radiology reports using natural language processing (NLP) methods. Our system, developed for mammography, extracts noun phrases from free-text mammography reports and classifies each noun phrase as “Has Candidate RadLex Term” or “Does Not Have Candidate RadLex Term,” with the goal of extending the mammography part of RadLex. We tested the performance of our algorithm using 100 free-text mammography reports. An expert radiologist determined the true positive and true negative RadLex candidate terms. We calculated precision (positive predictive value) and recall (sensitivity) to judge the system’s performance. Finally, to identify new candidate terms for enhancing RadLex, we applied our NLP method to 270,540 free-text mammography reports obtained from three academic institutions. Our method demonstrated a precision of 0.77 (159/206 terms) and a recall of 0.94 (159/170 terms). The overall accuracy of the system was 0.80 (235/293 terms). When we ran our system on the set of 270,540 reports, it found 31,800 unique noun phrases that are potential candidates for RadLex. Our data-driven approach to mining radiology reports can identify new candidate terms for expanding the breast imaging lexicon portion of RadLex and may be a useful approach for discovering new candidate terms from other radiology domains.
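The reported fractions can be cross-checked with a short calculation. The sketch below is illustrative only and is not the authors' code. The abstract does not state the false positive, false negative, or true negative counts, so they are derived here from the reported numerators and denominators, assuming all three metrics refer to the same 293 evaluated noun phrases. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts for candidate-term classification."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def precision(self) -> float:
        # Positive predictive value: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Sensitivity: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    @property
    def accuracy(self) -> float:
        # (TP + TN) / all evaluated items
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


def counts_from_reported(tp: int, predicted_positive: int,
                         actual_positive: int, total: int) -> ConfusionCounts:
    """Derive FP, FN, and TN from the reported fractions (an assumption:
    all fractions describe the same evaluation set)."""
    fp = predicted_positive - tp
    fn = actual_positive - tp
    tn = total - tp - fp - fn
    return ConfusionCounts(tp=tp, fp=fp, fn=fn, tn=tn)


# Values taken from the abstract: 159/206 (precision), 159/170 (recall),
# and 293 terms in total (denominator of the accuracy fraction).
counts = counts_from_reported(tp=159, predicted_positive=206,
                              actual_positive=170, total=293)

print(f"precision = {counts.precision:.2f}")
print(f"recall    = {counts.recall:.2f}")
print(f"accuracy  = {counts.accuracy:.2f}")
```

Under that assumption the derived counts are 47 false positives, 11 false negatives, and 76 true negatives, and 159 + 76 = 235 matches the numerator of the reported accuracy.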
Metadata
Publisher: Springer International Publishing
Print ISSN: 2948-2925
Electronic ISSN: 2948-2933
DOI: https://doi.org/10.1007/s10278-018-0064-0
