Published in: BMC Medical Informatics and Decision Making 1/2020

Open Access 01-12-2020 | Research article

A system for automatically extracting clinical events with temporal information

Authors: Zhijing Li, Chen Li, Yu Long, Xuan Wang

Abstract

Background

The popularization of health and medical informatics yields huge amounts of data. Extracting clinical events along a temporal course, a structure that presents information in chronological order, is the foundation for advanced applications and research. Manual extraction is extremely challenging because of the quantity and complexity of the records.

Methods

We present a recurrent neural network-based architecture that automatically extracts clinical event expressions along with each event’s temporal information. The system builds on attention-based and recursive neural networks, introduces a piecewise representation (each input sentence is divided into three pieces to better utilize the information it contains), and incorporates semantic information through word representations obtained from BioASQ and Wikipedia.
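The piecewise representation and the attention step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that each sentence is divided into three pieces, so the choice of cut points (the positions of the two mentions being linked), the dot-product scoring, and the names `split_into_pieces` and `attention_pool` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def split_into_pieces(
    tokens: Sequence[str], first: int, second: int
) -> Tuple[List[str], List[str], List[str]]:
    """Split a sentence into three pieces around two mention positions.

    Assumption: the cut points are the token indices of the two mentions
    (an event and a time expression, or two events). The source says only
    that the sentence is divided into three pieces.
    """
    if not 0 <= first < second < len(tokens):
        raise ValueError("expected 0 <= first < second < len(tokens)")
    left = list(tokens[: first + 1])               # start up to the first mention
    middle = list(tokens[first + 1 : second + 1])  # after it, up to the second mention
    right = list(tokens[second + 1 :])             # everything after the second mention
    return left, middle, right


def attention_pool(
    vectors: Sequence[Sequence[float]], query: Sequence[float]
) -> List[float]:
    """Softmax-weighted average of token vectors.

    Assumption: each token is scored by its dot product with a query vector;
    the scoring function used by the published system is not given here.
    """
    if not vectors:
        return [0.0] * len(query)  # an empty piece contributes a zero vector
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [
        sum(w * v[d] for w, v in zip(weights, vectors)) / total
        for d in range(len(query))
    ]
```

For the sentence "The patient had a colonoscopy on March 3 and recovered well", calling `split_into_pieces(tokens, 4, 7)` with the mentions "colonoscopy" (index 4) and "3" (index 7) gives the pieces "The patient had a colonoscopy", "on March 3", and "and recovered well". Pooling each piece separately and concatenating the three results is one way such a model could preserve positional structure that a single sentence-level pooling would discard.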

Results

The system is evaluated on the THYME corpus, a set of manually annotated clinical records from Mayo Clinic. To further verify its effectiveness, the system is also evaluated on the TimeBank-Dense corpus. The experiments demonstrate that the system outperforms the current state-of-the-art models. The system also supports domain adaptation, i.e., a model trained on colon cancer data may be applied to brain cancer data.

Conclusion

Our system extracts temporal expressions and event expressions and links them according to the order in which they actually occurred, which can structure the key information in complicated unstructured clinical records. Furthermore, we demonstrate that combining the piecewise representation method with an attention mechanism can capture more complete features. The system is flexible and can be extended to handle other document types.
Metadata
Title
A system for automatically extracting clinical events with temporal information
Authors
Zhijing Li
Chen Li
Yu Long
Xuan Wang
Publication date
01-12-2020
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2020
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-020-01208-9
