Published in:
Open Access
01-12-2017 | Research article
Enriching the international clinical nomenclature with Chinese daily used synonyms and concept recognition in physician notes
Authors:
Rui Zhang, Jialin Liu, Yong Huang, Miye Wang, Qingke Shi, Jun Chen, Zhi Zeng
Published in:
BMC Medical Informatics and Decision Making
|
Issue 1/2017
Login to get access
Abstract
Background
It has been shown that the entities in everyday clinical text are often expressed in a way that varies from how they are expressed in the nomenclature. Owing to lots of synonyms, abbreviations, medical jargons or even misspellings in the daily used physician notes in clinical information system (CIS), the terminology without enough synonyms may not be adequately suitable for the task of Chinese clinical term recognition.
Methods
This paper demonstrates a validated system to retrieve the Chinese term of clinical finding (CTCF) from CIS and map them to the corresponding concepts of international clinical nomenclature, such as SNOMED CT. The system focuses on the SNOMED CT with Chinese synonyms enrichment (SCCSE). The literal similarity and the diagnosis-related similarity metrics were used for concept mapping. Two CTCF recognition methods, the rule- and terminology-based approach (RTBA) and the conditional random field machine learner (CRF), were adopted to identify the concepts in physician notes. The system was validated against the history of present illness annotated by clinical experts. The RTBA and CRF could be combined to predict new CTCFs besides SCCSE persistently.
Results
Around 59,000 CTCF candidates were accepted as valid and 39,000 of them occurred at least once in the history of present illness. 3,729 of them were accordant with the description in referenced Chinese clinical nomenclature, which could cross map to other international nomenclature such as SNOMED CT. With the hybrid similarity metrics, another 7,454 valid CTCFs (synonyms) were succeeded in concept mapping. For CTCF recognition in physician notes, a series of experiments were performed to find out the best CRF feature set, which gained an F-score of 0.887. The RTBA achieved a better F-score of 0.919 by the CTCF dictionary created in this research.
Conclusions
This research demonstrated that it is feasible to help the SNOMED CT with Chinese synonyms enrichment based on physician notes in CIS. With continuous maintenance of SCCSE, the CTCFs could be precisely retrieved from free text, and the CTCFs arranged in semantic hierarchy of SNOMED CT could greatly improve the meaningful use of electronic health record in China. The methodology is also useful for clinical synonyms enrichment in other languages.