BMC Medical Informatics and Decision Making

Open Access 01-12-2021 | Research

Multi-task learning for Chinese clinical named entity recognition with external knowledge

Authors: Ming Cheng, Shufeng Xiong, Fei Li, Pan Liang, Jianbo Gao

Published in: BMC Medical Informatics and Decision Making | Issue 1/2021

Abstract

Background

Named entity recognition (NER) on Chinese electronic medical and healthcare records has attracted significant attention because it can be used to build applications that understand these records. Most previous methods have been purely data-driven, requiring large-scale, high-quality labeled medical data. However, labeled data is expensive to obtain, and these data-driven methods struggle to handle rare and unseen entities.

Methods

To tackle these problems, this study presents a novel multi-task deep neural network model for Chinese NER in the medical domain. We incorporate dictionary features into the neural network, and a general, secondary named entity segmentation task serves as an auxiliary task to improve performance on the primary task of named entity recognition.
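As a rough illustration of the two inputs described above, the sketch below derives type-free segmentation labels from NER labels and builds per-character dictionary features. The abstract does not specify the tagging scheme, the matching strategy, or the feature encoding, so the BIO scheme, forward maximum matching, and the names `to_segmentation_tags`, `dictionary_features`, `lexicon`, and `max_len` are all assumptions made for this example, not the authors' implementation.

```python
from typing import Dict, List


def to_segmentation_tags(ner_tags: List[str]) -> List[str]:
    """Strip entity types from BIO tags, keeping only boundary information.

    Assumption: the auxiliary segmentation labels are the NER labels
    without their type suffix (e.g. "B-DISEASE" becomes "B").
    """
    return [tag.split("-", 1)[0] for tag in ner_tags]


def dictionary_features(
    chars: List[str], lexicon: Dict[str, str], max_len: int = 4
) -> List[str]:
    """Tag each character by forward maximum matching against a lexicon.

    `lexicon` maps a surface string to an entity type (hypothetical format).
    Characters covered by a match get B-/I-<type>; all others get "O".
    """
    feats = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = False
        # Try the longest candidate first, then shrink the window.
        for n in range(min(max_len, len(chars) - i), 0, -1):
            word = "".join(chars[i:i + n])
            if word in lexicon:
                etype = lexicon[word]
                feats[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    feats[j] = "I-" + etype
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return feats


if __name__ == "__main__":
    sentence = list("患者有糖尿病史")
    print(dictionary_features(sentence, {"糖尿病": "DISEASE"}))
    print(to_segmentation_tags(["O", "B-DISEASE", "I-DISEASE"]))
```

In a model of this kind, the dictionary tags would typically be embedded and concatenated with the character embeddings, while the segmentation tags supervise a second output layer that shares the encoder with the NER layer. Both of those choices are also assumptions here.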

Results

To evaluate the proposed method, we compare it with other currently popular methods on three benchmark datasets. Two of the datasets are publicly available, and the third was constructed by us. Experimental results show that the proposed model achieves an average F-measure of 91.07% on the two public datasets and an F-measure of 87.05% on the private dataset.
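For readers unfamiliar with the metric, the sketch below shows one common way to compute an entity-level F-measure for NER. The abstract does not say which variant was used, so this example assumes BIO tags, exact-match scoring (an entity counts only if its span and type both match), and the balanced F1 form F = 2PR / (P + R). The function names are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def extract_entities(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    entities: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = (
            tag.startswith("I-") and start is not None and tag[2:] == etype
        )
        if continues:
            continue
        if start is not None:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def f_measure(gold_tags: List[str], pred_tags: List[str]) -> float:
    """Exact-match entity-level F1 between gold and predicted BIO tags."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["B-DIS", "I-DIS", "O", "B-SYM"]
    pred = ["B-DIS", "I-DIS", "O", "O"]
    print(f_measure(gold, pred))
```

In the usage example, the prediction recovers one of the two gold entities with no false positives, so precision is 1.0 and recall is 0.5.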

Conclusions

Comparisons across the different models demonstrated the effectiveness of the proposed model, which outperformed traditional statistical models.

Title: Multi-task learning for Chinese clinical named entity recognition with external knowledge
Authors: Ming Cheng, Shufeng Xiong, Fei Li, Pan Liang, Jianbo Gao
Publication date: 01-12-2021
Publisher: BioMed Central
Published in: BMC Medical Informatics and Decision Making / Issue 1/2021
Electronic ISSN: 1472-6947
DOI: https://doi.org/10.1186/s12911-021-01717-1
