Published in: BMC Medical Informatics and Decision Making 1/2021

Open Access 01-12-2021 | Research

Multi-task learning for Chinese clinical named entity recognition with external knowledge

Authors: Ming Cheng, Shufeng Xiong, Fei Li, Pan Liang, Jianbo Gao

Abstract

Background

Named entity recognition (NER) on Chinese electronic medical and healthcare records has attracted significant attention because it underpins applications that interpret these records. Most previous methods have been purely data-driven and require large-scale, high-quality labeled medical data. However, labeled data is expensive to obtain, and data-driven methods struggle to handle rare and unseen entities.

Methods

To tackle these problems, this study presents a novel multi-task deep neural network model for Chinese NER in the medical domain. We incorporate dictionary features into the neural network, and we use a general named entity segmentation task as an auxiliary task to improve performance on the primary task of named entity recognition.
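
The two ingredients named above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how dictionary matches are computed or how the tasks are combined, so the greedy forward maximum matching, the BIO feature scheme, the weighted-sum loss, and every name here (`dictionary_features`, `segmentation_labels`, `joint_loss`, `aux_weight`, the toy lexicon) are assumptions made for illustration.

```python
from typing import Dict, List


def dictionary_features(chars: List[str], lexicon: Dict[str, str],
                        max_len: int = 6) -> List[str]:
    """Mark characters covered by a lexicon entry with B-/I-<type>, else O.

    Assumption: greedy forward maximum matching over character n-grams.
    """
    feats = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = False
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(chars) - i), 0, -1):
            span = "".join(chars[i:i + n])
            if span in lexicon:
                etype = lexicon[span]
                feats[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    feats[j] = "I-" + etype
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return feats


def segmentation_labels(ner_tags: List[str]) -> List[str]:
    """Derive type-agnostic boundary labels (B/I/O) for the auxiliary task."""
    return [tag.split("-", 1)[0] for tag in ner_tags]


def joint_loss(ner_loss: float, seg_loss: float,
               aux_weight: float = 0.5) -> float:
    """Assumed objective: primary loss plus a weighted auxiliary loss."""
    return ner_loss + aux_weight * seg_loss


# Toy example: "The patient has a history of hypertension."
chars = list("患者有高血压病史")
lexicon = {"高血压": "DISEASE"}  # hypothetical one-entry medical dictionary
ner_tags = ["O", "O", "O", "B-DISEASE", "I-DISEASE", "I-DISEASE", "O", "O"]

print(dictionary_features(chars, lexicon))
print(segmentation_labels(ner_tags))
```

In a sketch like this, the dictionary tags would typically be embedded and concatenated with the character embeddings, while the segmentation labels give the shared encoder an extra supervision signal about entity boundaries independent of entity type.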

Results

To evaluate the proposed method, we compare it with other currently popular methods on three benchmark datasets. Two of the datasets are publicly available, and the third was constructed by us. Experimental results show that the proposed model achieves an average F-measure of 91.07% on the two public datasets and an F-measure of 87.05% on the private dataset.
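
For readers unfamiliar with the metric, the F-measure for NER is usually computed over whole entities rather than individual characters. The sketch below assumes strict exact-match scoring on BIO tags (same boundaries and same type); the abstract does not state which matching criterion was used, and the function names and toy tags are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence; stray I- tags are ignored."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        inside = tag.startswith("I-") and start is not None and tag[2:] == etype
        if not inside:
            if start is not None:
                spans.add((start, i, etype))
            start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans


def f_measure(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under strict span-and-type matching."""
    g, p = extract_spans(gold), extract_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["B-DIS", "I-DIS", "O", "B-SYM"]
pred = ["B-DIS", "I-DIS", "O", "O"]
print(f_measure(gold, pred))  # one of two gold entities found, no false positives
```

In this toy case precision is 1.0 and recall is 0.5, so the F-measure is their harmonic mean, 2/3.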

Conclusions

Comparison with other models demonstrates the effectiveness of the proposed model, which outperforms traditional statistical models.
Metadata
Title
Multi-task learning for Chinese clinical named entity recognition with external knowledge
Authors
Ming Cheng
Shufeng Xiong
Fei Li
Pan Liang
Jianbo Gao
Publication date
01-12-2021
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2021
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-021-01717-1
