Published in: BMC Medical Informatics and Decision Making 2/2018

Open Access 01-07-2018 | Research

Chemical-induced disease extraction via recurrent piecewise convolutional neural networks

Authors: Haodi Li, Ming Yang, Qingcai Chen, Buzhou Tang, Xiaolong Wang, Jun Yan

Published in: BMC Medical Informatics and Decision Making | Special Issue 2/2018


Abstract

Background

Extracting relationships between chemicals and diseases from unstructured literature has attracted considerable attention, because these relationships are useful for many biomedical applications such as drug repositioning and pharmacovigilance. The availability of publicly annotated corpora has led to a number of machine learning methods for chemical-induced disease (CID) extraction. With the exception of deep learning methods, most of them depend on time-consuming feature engineering. In this paper, we propose a novel document-level deep learning method, called recurrent piecewise convolutional neural networks (RPCNN), for CID extraction.

Results

Experimental results on a benchmark dataset for CID extraction, the CDR (Chemical-induced Disease Relation) dataset of the BioCreative V challenge, show that the highest precision, recall and F-score of our RPCNN-based CID extraction system are 65.24%, 77.21% and 70.77%, respectively, which is competitive with other state-of-the-art systems.

Conclusions

A novel deep learning method is proposed for document-level CID extraction that combines domain knowledge, a piecewise strategy, an attention mechanism, and multi-instance learning. Experiments on a benchmark dataset demonstrate the effectiveness of the method.
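The abstract names a piecewise strategy without describing it. As a rough illustration only, the sketch below shows piecewise max pooling as it is commonly used in piecewise convolutional networks for relation extraction: the positions of the two entities split a feature map into three pieces, and each piece is pooled separately. The function name, the argument names, the segment boundaries and the 0.0 fallback for an empty piece are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def piecewise_max_pool(
    feature_map: Sequence[float], chem_pos: int, dis_pos: int
) -> List[float]:
    """Max-pool one convolutional feature map in three segments.

    The two entity positions (hypothetical token indices) split the
    sequence into a left, middle and right piece; each piece
    contributes its own maximum.
    """
    first, second = sorted((chem_pos, dis_pos))
    pieces = [
        feature_map[: first + 1],              # start up to the first entity
        feature_map[first + 1 : second + 1],   # after it, up to the second entity
        feature_map[second + 1 :],             # everything after the second entity
    ]
    # Assumption: an empty piece (e.g. an entity at the very end) pools to 0.0.
    return [max(piece) if piece else 0.0 for piece in pieces]


# Toy feature map over six tokens; chemical at index 1, disease at index 3.
features = [0.1, 0.7, 0.3, 0.9, 0.2, 0.4]
pooled = piecewise_max_pool(features, chem_pos=1, dis_pos=3)
```

Compared with a single global maximum, keeping one value per piece preserves some information about where a strong feature occurs relative to the two entities.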
Metadata
Title
Chemical-induced disease extraction via recurrent piecewise convolutional neural networks
Authors
Haodi Li
Ming Yang
Qingcai Chen
Buzhou Tang
Xiaolong Wang
Jun Yan
Publication date
01-07-2018
Publisher
BioMed Central
DOI
https://doi.org/10.1186/s12911-018-0629-3
