Published in: BMC Medical Informatics and Decision Making 1/2017

Open Access 01-12-2017 | Research article

Enriching the international clinical nomenclature with Chinese daily used synonyms and concept recognition in physician notes

Authors: Rui Zhang, Jialin Liu, Yong Huang, Miye Wang, Qingke Shi, Jun Chen, Zhi Zeng

Abstract

Background

Entities in everyday clinical text are often expressed differently from how they appear in the nomenclature. Because the physician notes used daily in clinical information systems (CIS) contain many synonyms, abbreviations, medical jargon, and even misspellings, a terminology without enough synonyms may not be adequate for Chinese clinical term recognition.

Methods

This paper presents a validated system that retrieves Chinese terms of clinical findings (CTCFs) from the CIS and maps them to the corresponding concepts of an international clinical nomenclature such as SNOMED CT. The system centers on SNOMED CT with Chinese synonyms enrichment (SCCSE). Literal similarity and diagnosis-related similarity metrics were used for concept mapping. Two CTCF recognition methods, the rule- and terminology-based approach (RTBA) and a conditional random field (CRF) machine learner, were used to identify concepts in physician notes. The system was validated against histories of present illness annotated by clinical experts. The RTBA and CRF could be combined to keep predicting new CTCFs beyond those already in SCCSE.
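The abstract does not define the literal similarity metric or the matching rules of the RTBA, so the following is only a minimal sketch under stated assumptions: literal similarity is approximated by a Dice coefficient over character bigrams, and dictionary-based recognition by forward maximum matching. The function names, the threshold, and the concept identifiers ("C001", "C002") are hypothetical placeholders, not SNOMED CT codes. The diagnosis-related similarity and the CRF learner are not covered.

```python
def char_bigrams(term):
    """Return the set of character bigrams of a term (the term itself if shorter than 2)."""
    if len(term) < 2:
        return {term}
    return {term[i:i + 2] for i in range(len(term) - 1)}


def literal_similarity(a, b):
    """Dice coefficient over character bigrams (an assumed stand-in metric)."""
    ba, bb = char_bigrams(a), char_bigrams(b)
    return 2 * len(ba & bb) / (len(ba) + len(bb))


def best_concept(candidate, synonyms, threshold=0.5):
    """Map a candidate term to the concept of its most similar known term.

    `synonyms` is a non-empty dict from known term to concept identifier.
    Returns None when no known term reaches the (hypothetical) threshold.
    """
    best_term = max(synonyms, key=lambda t: literal_similarity(candidate, t))
    if literal_similarity(candidate, best_term) < threshold:
        return None
    return synonyms[best_term]


def match_terms(text, dictionary, max_len=8):
    """Forward maximum matching: at each position, take the longest dictionary term."""
    found, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in dictionary:
                found.append((i, text[i:i + n]))
                i += n
                break
        else:
            # No dictionary term starts here; move on by one character.
            i += 1
    return found


known = {"上腹部疼痛": "C001", "咳嗽": "C002"}  # placeholder concept IDs
print(best_concept("上腹疼痛", known))
print(match_terms("患者咳嗽伴上腹部疼痛", set(known)))
```

Character bigrams are used here because Chinese text has no whitespace word boundaries; any other string similarity could be substituted without changing the overall mapping step.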

Results

Around 59,000 CTCF candidates were accepted as valid, and 39,000 of them occurred at least once in the history of present illness. Of them, 3,729 matched the descriptions in the referenced Chinese clinical nomenclature, which can be cross-mapped to other international nomenclatures such as SNOMED CT. With the hybrid similarity metrics, another 7,454 valid CTCFs (synonyms) were successfully mapped to concepts. For CTCF recognition in physician notes, a series of experiments was performed to find the best CRF feature set, which achieved an F-score of 0.887. The RTBA achieved a better F-score of 0.919 using the CTCF dictionary created in this research.
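For reference, the F-score is conventionally the harmonic mean of precision and recall. The abstract does not say which variant was used, so the balanced form below is an assumption; TP, FP, and FN denote the counts of correctly recognized, spuriously recognized, and missed terms.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}
```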

Conclusions

This research demonstrated that it is feasible to enrich SNOMED CT with Chinese synonyms based on physician notes in the CIS. With continuous maintenance of SCCSE, CTCFs could be precisely retrieved from free text, and CTCFs arranged in the semantic hierarchy of SNOMED CT could greatly improve the meaningful use of electronic health records in China. The methodology is also useful for clinical synonym enrichment in other languages.
Metadata
Title
Enriching the international clinical nomenclature with Chinese daily used synonyms and concept recognition in physician notes
Authors
Rui Zhang
Jialin Liu
Yong Huang
Miye Wang
Qingke Shi
Jun Chen
Zhi Zeng
Publication date
01-12-2017
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2017
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-017-0455-z
