BMC Medical Informatics and Decision Making

Open Access 01-09-2018 | Research

Comparison of MetaMap and cTAKES for entity extraction in clinical notes

Authors: Ruth Reátegui, Sylvie Ratté

Published in: BMC Medical Informatics and Decision Making | Special Issue 3/2018

Abstract

Background

Clinical notes such as discharge summaries have a semi-structured or unstructured format. These documents contain information about diseases, treatments, drugs, etc. Extracting meaningful information from them is challenging because of their narrative format. In this context, we aimed to compare how well two tools, MetaMap and cTAKES, automatically extract medical entities.

Methods

We worked with the i2b2 (Informatics for Integrating Biology and the Bedside) Obesity Challenge data. Two experiments were constructed. In the first, only one UMLS concept related to the annotated diseases was extracted. In the second, some UMLS concepts were aggregated.
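The difference between the two experiments can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes each tool's output for a note has already been reduced to a set of UMLS concept identifiers, and that each disease is represented either by one identifier (first experiment) or by a group of identifiers (second experiment). The function name `diseases_found` and identifiers such as `CUI_OBESITY` are hypothetical placeholders, not real UMLS CUIs.

```python
def diseases_found(extracted_cuis, disease_to_cuis):
    """Return the diseases whose concept set overlaps the concepts extracted from a note.

    extracted_cuis:  set of concept identifiers a tool produced for one note.
    disease_to_cuis: dict mapping a disease label to the set of identifiers
                     that count as a mention of that disease.
    """
    found = set()
    for disease, cuis in disease_to_cuis.items():
        # A disease is detected if at least one of its concepts was extracted.
        if extracted_cuis & cuis:
            found.add(disease)
    return found


# Hypothetical placeholder identifiers (NOT real UMLS CUIs).
single_concept = {"obesity": {"CUI_OBESITY"}}
aggregated = {"obesity": {"CUI_OBESITY", "CUI_MORBID_OBESITY", "CUI_OVERWEIGHT"}}

# Concepts a tool might have extracted from one note (made-up example).
note_cuis = {"CUI_MORBID_OBESITY", "CUI_UNRELATED"}

without_aggregation = diseases_found(note_cuis, single_concept)
with_aggregation = diseases_found(note_cuis, aggregated)
```

In this made-up note, the single-concept mapping misses the disease while the aggregated mapping detects it, which is the kind of recall gain that aggregation is meant to provide.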

Results

Results were evaluated against manually annotated medical entities. The aggregation process improved the results. MetaMap achieved an average recall of 0.88, precision of 0.89, and F-score of 0.88. With cTAKES, the average recall, precision, and F-score were 0.91, 0.89, and 0.89, respectively.
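For readers unfamiliar with the metrics, the following minimal Python sketch shows how precision, recall, and F-score are commonly computed when extracted entities are compared with a manual annotation. It assumes exact matching of (document, disease) pairs and uses the balanced F-score (F1); the abstract does not state the matching criteria or how the averages were taken, so those details, the function name `precision_recall_f1`, and the example data are all assumptions.

```python
def precision_recall_f1(predicted, gold):
    """Score a set of extracted entities against a manually annotated set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of extracted entities that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of annotated entities that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up (document id, disease) pairs for illustration only.
gold = {("doc1", "obesity"), ("doc1", "hypertension"),
        ("doc2", "asthma"), ("doc2", "diabetes")}
predicted = {("doc1", "obesity"), ("doc1", "hypertension"),
             ("doc2", "asthma"), ("doc2", "gout")}

p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Note that an F-score averaged over several diseases need not equal the harmonic mean of the averaged precision and recall, so reported averages like those above cannot be recomputed from one another.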

Conclusions

The aggregation of concepts (with similar and different semantic types) proved to be a good strategy for improving the extraction of medical entities, and automatic aggregation could be considered in future work.

Title: Comparison of MetaMap and cTAKES for entity extraction in clinical notes
Authors: Ruth Reátegui, Sylvie Ratté
Publication date: 01-09-2018
Publisher: BioMed Central
Published in: BMC Medical Informatics and Decision Making | Special Issue 3/2018
Electronic ISSN: 1472-6947
DOI: https://doi.org/10.1186/s12911-018-0654-2
