Published in: BMC Medical Informatics and Decision Making 14/2020

Open Access 01-12-2020 | Heart Failure | Research

Lab indicators standardization method for the regional healthcare platform: a case study on heart failure

Authors: Ming Liang, ZhiXing Zhang, JiaYing Zhang, Tong Ruan, Qi Ye, Ping He

Published in: BMC Medical Informatics and Decision Making | Special Issue 14/2020

Abstract

Background

Laboratory indicator test results in electronic health records have been used in many clinical big data analyses. However, the same laboratory examination item (i.e., lab indicator) is often recorded under different Chinese names because of translation differences and hospital-specific naming habits, which distorts analysis results.

Methods

We propose a framework that combines a recall model with a binary classification model, which reduces the alignment scale and improves the accuracy of lab indicator normalization. To reduce the alignment scale, TF-IDF is used for candidate selection. To ensure the accuracy of the output, we use the enhanced sequential inference model for binary classification. Active learning, with a newly proposed selection strategy, is applied to reduce annotation cost.
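The candidate selection step can be illustrated with a minimal sketch. It assumes character bigram TF-IDF vectors and cosine similarity, which the abstract does not specify; the function names and the English indicator names are hypothetical and chosen only for illustration. Character n-grams are used because they also apply to Chinese names without word segmentation.

```python
import math
from collections import Counter


def char_ngrams(name, n=2):
    """Split a name into overlapping character n-grams."""
    name = name.strip().lower()
    if len(name) < n:
        return [name] if name else []
    return [name[i:i + n] for i in range(len(name) - n + 1)]


def vectorize(counts, idf):
    """Turn n-gram counts into an L2-normalised sparse TF-IDF vector."""
    vec = {g: tf * idf[g] for g, tf in counts.items() if g in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}


def build_tfidf(names, n=2):
    """Return smoothed IDF weights and one TF-IDF vector per name."""
    docs = [Counter(char_ngrams(name, n)) for name in names]
    df = Counter(g for doc in docs for g in doc)  # document frequency
    idf = {g: math.log((1 + len(names)) / (1 + c)) + 1.0 for g, c in df.items()}
    return idf, [vectorize(doc, idf) for doc in docs]


def recall_candidates(query, standard_names, k=3, n=2):
    """Return up to k standard names most similar to the query (assumed recall step)."""
    idf, vectors = build_tfidf(standard_names, n)
    q = vectorize(Counter(char_ngrams(query, n)), idf)
    scored = []
    for name, vec in zip(standard_names, vectors):
        # Dot product of two unit vectors equals their cosine similarity.
        score = sum(w * vec.get(g, 0.0) for g, w in q.items())
        scored.append((score, name))
    scored.sort(key=lambda pair: -pair[0])
    return [(name, score) for score, name in scored[:k] if score > 0]


if __name__ == "__main__":
    # Hypothetical standard vocabulary and raw hospital name.
    standard = [
        "aspartate aminotransferase",
        "alanine aminotransferase",
        "serum creatinine",
        "hemoglobin",
    ]
    for name, score in recall_candidates("aspartate transaminase", standard, k=2):
        print(f"{name}\t{score:.3f}")
```

Only the few candidates returned here would be passed to the binary classifier, which is how a recall step of this kind reduces the number of pairs to be compared.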

Results

Since our indicator standardization method mainly focuses on inconsistency among Chinese indicator names, we perform our experiments on data from the Shanghai Hospital Development Center (SHDC), selecting clinical data from 8 hospitals. The method achieves an F1-score of 92.08% in the final binary classification. For active learning, the proposed strategy performs better than the random baseline and, with only 43% of the training data, outperforms the model trained on the full data. A case study on heart failure clinical analysis, conducted on a sub-dataset collected from SHDC, shows that the proposed method is practical and performs well in application.
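The abstract does not describe the selection strategy itself, so the following is only a generic uncertainty sampling baseline, not the authors' method. It assumes the binary classifier outputs a match probability for each unlabeled name pair; the function name and the sample values are hypothetical.

```python
def select_for_annotation(probabilities, budget):
    """Return indices of the `budget` pairs whose match probability is closest to 0.5.

    Generic least-confidence selection, assumed here for illustration only.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical classifier outputs for five unlabeled candidate pairs.
    probs = [0.97, 0.52, 0.10, 0.45, 0.80]
    print(select_for_annotation(probs, budget=2))
```

In each round, the selected pairs would be labeled by an annotator and added to the training set before the classifier is retrained, so annotation effort goes to the pairs the model is least sure about.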

Conclusion

This work demonstrates that the proposed framework can be effectively applied to lab indicator normalization, and that active learning is suitable for reducing annotation cost in this task. The method is also valuable for data cleaning, data mining, text extraction and entity alignment.
Metadata
Title
Lab indicators standardization method for the regional healthcare platform: a case study on heart failure
Authors
Ming Liang
ZhiXing Zhang
JiaYing Zhang
Tong Ruan
Qi Ye
Ping He
Publication date
01-12-2020
Publisher
BioMed Central
Keyword
Heart Failure
DOI
https://doi.org/10.1186/s12911-020-01324-6
