Published in: Journal of Digital Imaging 5/2015

01-10-2015

Evaluation of an Automated Information Extraction Tool for Imaging Data Elements to Populate a Breast Cancer Screening Registry

Authors: Ronilda Lacson, Kimberly Harris, Phyllis Brawarsky, Tor D. Tosteson, Tracy Onega, Anna N. A. Tosteson, Abby Kaye, Irina Gonzalez, Robyn Birdwell, Jennifer S. Haas

Published in: Journal of Imaging Informatics in Medicine | Issue 5/2015


Abstract

Breast cancer screening is central to early breast cancer detection. Identifying and monitoring process measures for screening is a focus of the National Cancer Institute’s Population-based Research Optimizing Screening through Personalized Regimens (PROSPR) initiative, which requires participating centers to report structured data across the cancer screening continuum. We evaluate the accuracy of automated information extraction of imaging findings from radiology reports, which are available as unstructured text. We present prevalence estimates of imaging findings for breast imaging received by women who obtained care in a primary care network participating in PROSPR (n = 139,953 radiology reports) and compare automatically extracted data elements to a “gold standard” based on manual review for a validation sample of 941 randomly selected radiology reports, including mammograms, digital breast tomosynthesis, ultrasound, and magnetic resonance imaging (MRI). The prevalence of imaging findings varies by data element and modality (e.g., suspicious calcification was noted in 2.6% of screening mammograms, 12.1% of diagnostic mammograms, and 9.4% of tomosynthesis exams). In the validation sample, recall, precision, and F-measure for identifying imaging findings range from 0.8 to 1.0; the findings evaluated include suspicious calcifications, masses, and architectural distortion (on mammogram and tomosynthesis); masses, cysts, non-mass enhancement, and enhancing foci (on MRI); and masses and cysts (on ultrasound). Information extraction tools can be used for accurate documentation of imaging findings as structured data elements from text reports for a variety of breast imaging modalities. These data can be used to populate screening registries to help elucidate more effective breast cancer screening processes.
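As an illustration of the validation metrics named in the abstract, the following minimal sketch computes recall, precision, and F-measure for one data element by comparing automated labels with manual-review labels. It is not the authors' code: the function and class names are hypothetical, the F-measure is assumed to be the balanced F1 score (the abstract does not state the weighting), and the example labels are invented toy values rather than study data.

```python
from dataclasses import dataclass


@dataclass
class ExtractionScores:
    precision: float
    recall: float
    f_measure: float


def score_extraction(extracted: list[bool], gold: list[bool]) -> ExtractionScores:
    """Score automated labels against manual-review ("gold standard") labels.

    Each position is one radiology report; True means the finding
    (for example, a suspicious calcification) is reported as present.
    """
    if len(extracted) != len(gold):
        raise ValueError("extracted and gold must have the same length")

    pairs = list(zip(extracted, gold))
    true_pos = sum(1 for e, g in pairs if e and g)
    false_pos = sum(1 for e, g in pairs if e and not g)
    false_neg = sum(1 for e, g in pairs if not e and g)

    # Precision: share of automatically flagged reports that reviewers confirmed.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of reviewer-confirmed findings that the tool also flagged.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # Balanced F-measure (assumed F1): harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return ExtractionScores(precision, recall, f_measure)


# Toy labels for five reports (illustrative only, not from the study).
toy_extracted = [True, True, False, True, False]
toy_gold = [True, False, False, True, True]
toy_scores = score_extraction(toy_extracted, toy_gold)
```

In a setting like the one described, such a function would be applied separately to each data element and modality, which is why the abstract reports a range of values rather than a single score.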
Metadata
Title
Evaluation of an Automated Information Extraction Tool for Imaging Data Elements to Populate a Breast Cancer Screening Registry
Authors
Ronilda Lacson
Kimberly Harris
Phyllis Brawarsky
Tor D. Tosteson
Tracy Onega
Anna N. A. Tosteson
Abby Kaye
Irina Gonzalez
Robyn Birdwell
Jennifer S. Haas
Publication date
01-10-2015
Publisher
Springer US
Published in
Journal of Imaging Informatics in Medicine / Issue 5/2015
Print ISSN: 2948-2925
Electronic ISSN: 2948-2933
DOI
https://doi.org/10.1007/s10278-014-9762-4
