Published in: BMC Medical Informatics and Decision Making 1/2012

Open Access 01-12-2012 | Proceedings

Detecting modification of biomedical events using a deep parsing approach

Authors: Andrew MacKinlay, David Martinez, Timothy Baldwin

Published in: BMC Medical Informatics and Decision Making | Special Issue 1/2012

Abstract

Background

This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. "analysis of IkappaBalpha phosphorylation", where it is not specified whether phosphorylation did or did not occur) or negated (e.g. "inhibition of IkappaBalpha phosphorylation", where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach in which the classification features combine shallow features derived from the words of the sentences with more complex features based on the semantic outputs produced by a deep parser.

Method

To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm.
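The shallow features described above can be illustrated with a minimal sketch. It assumes whitespace-tokenised input and a known trigger position; the function name `window_features`, the `bow=` and `trigger=` feature prefixes, the separate trigger-word feature, and the example sentence are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def window_features(tokens, trigger_index, window=3):
    """Bag-of-words features from up to `window` tokens on each side of the trigger.

    Hypothetical helper: the feature naming scheme is an assumption of this sketch.
    """
    left = tokens[max(0, trigger_index - window):trigger_index]
    right = tokens[trigger_index + 1:trigger_index + 1 + window]

    features = Counter()
    # Bag-of-words: the position of a token inside the window is ignored.
    for tok in left + right:
        features["bow=" + tok.lower()] += 1
    # Keeping the trigger word as its own feature is an assumption of this sketch.
    features["trigger=" + tokens[trigger_index].lower()] = 1
    return dict(features)


# Illustrative sentence built around the example phrase from the abstract.
sentence = "We performed an analysis of IkappaBalpha phosphorylation in T cells".split()
trigger = sentence.index("phosphorylation")
feats = window_features(sentence, trigger, window=3)
```

A feature dictionary of this kind could then be passed, together with any parser-derived features, to a Maximum Entropy learner; that step is omitted here because the abstract does not specify the implementation.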

Results

Over the test data, our methods produce an absolute increase of approximately 4% in F-score for detection of event modification, compared to a baseline based only on the shallow bag-of-words features.
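To make the distinction between an absolute and a relative F-score increase concrete, here is a small sketch using the standard balanced F-score (the harmonic mean of precision and recall). The counts below are made-up toy values chosen for easy arithmetic; they are not results from the paper.

```python
def f_score(tp, fp, fn):
    """Balanced F-score from true positive, false positive and false negative counts."""
    denominator = 2 * tp + fp + fn
    # Equivalent to 2PR / (P + R); defined as 0.0 when nothing is predicted or expected.
    return 2 * tp / denominator if denominator else 0.0


# Toy counts (hypothetical, not taken from the paper).
baseline = f_score(tp=50, fp=50, fn=50)   # 100 / 200
enhanced = f_score(tp=60, fp=40, fn=40)   # 120 / 200

absolute_gain = enhanced - baseline               # difference in F-score points
relative_gain = (enhanced - baseline) / baseline  # gain as a fraction of the baseline
```

With these toy counts the absolute gain is 10 percentage points while the relative gain is 20%, which is why it matters that the figure reported above is an absolute increase.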

Conclusions

Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification.
Metadata
Title
Detecting modification of biomedical events using a deep parsing approach
Authors
Andrew MacKinlay
David Martinez
Timothy Baldwin
Publication date
01-12-2012
Publisher
BioMed Central
DOI
https://doi.org/10.1186/1472-6947-12-S1-S4
