Published in: BMC Medical Informatics and Decision Making 1/2017

Open Access 01-12-2017 | Research article

Empirical advances with text mining of electronic health records

Authors: T. Delespierre, P. Denormandie, A. Bar-Hen, L. Josseran

Abstract

Background

Korian is a private group specializing in medical accommodation for elderly and dependent people. A professional data warehouse (DWH), established in 2010, hosts all of the residents’ data. Within this information system (IS), clinical narratives (CNs) were used only by medical staff, as a tool for linking residents’ care.
The objective of this study was to show that qualitative and quantitative textual analysis of a relatively small, well-defined sample of physiotherapy CNs makes it possible to build a physiotherapy corpus and, through this process, to generate a new body of knowledge by adding relevant information describing the residents’ care and lives.

Methods

Meaningful words were extracted with Structured Query Language (SQL), using the LIKE operator and wildcards for pattern matching, followed by text mining and a word cloud built with R® packages. A further step involved principal component analysis and multiple correspondence analysis, plus clustering, applied to the same residents’ sample as well as to other health data, using a health model that measures residents’ care-level needs.
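
The first stage described above (pattern matching with LIKE and wildcards, followed by term counting of the kind that feeds a word cloud) can be illustrated with a minimal sketch. It is not the authors' implementation: the study used SQL against the Korian DWH and R packages, whereas this sketch uses Python with an in-memory SQLite database. The table name `narratives`, its columns, the sample notes and the stopword list are all hypothetical and chosen only for illustration.

```python
import sqlite3
from collections import Counter

# Hypothetical, made-up narratives; the study's real data are not reproduced here.
NOTES = [
    (1, "Walking rehabilitation with walker, balance exercises"),
    (2, "Respiratory physiotherapy, patient tired"),
    (3, "Balance work and walking in corridor"),
    (4, "Massage of left shoulder, pain reduced"),
]


def find_notes(conn, pattern):
    """Return (resident_id, note) rows whose text matches a SQL LIKE pattern.

    '%' matches any run of characters, so '%walk%' finds 'Walking' and 'walker'.
    SQLite's LIKE is case-insensitive for ASCII letters by default.
    """
    cur = conn.execute(
        "SELECT resident_id, note FROM narratives "
        "WHERE note LIKE ? ORDER BY resident_id",
        (pattern,),
    )
    return cur.fetchall()


def term_counts(notes, stopwords=frozenset({"with", "and", "in", "of"})):
    """Count lower-cased tokens across notes, skipping a small stopword list."""
    counts = Counter()
    for note in notes:
        for token in note.lower().replace(",", " ").split():
            if token not in stopwords:
                counts[token] += 1
    return counts


# Build the hypothetical table in memory and load the sample notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE narratives (resident_id INTEGER, note TEXT)")
conn.executemany("INSERT INTO narratives VALUES (?, ?)", NOTES)

# Stage 1: select narratives mentioning walking-related words.
walking = find_notes(conn, "%walk%")

# Stage 2: term frequencies over the selected narratives.
counts = term_counts(note for _, note in walking)
```

Term frequencies such as `counts` are the usual input to a word cloud; in the study that step, along with the later principal component, correspondence and clustering analyses, was carried out with R packages.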

Results

By combining these techniques, physiotherapy treatments could be characterized by a list of constructed keywords, and profiles of the residents’ health characteristics were built. Feeding defects and health outlier groups could be detected, residents’ physiotherapy data were matched with their health data, and differing health situations were reflected in qualitative and quantitative differences in the physiotherapy narratives.

Conclusions

This textual experiment, conducted as a two-stage process, showed that text mining and data mining techniques provide convenient tools to improve residents’ health and quality of care by adding new, simple, usable data to the electronic health record (EHR). When used with a normalized physiotherapy problem list, text mining through information extraction (IE), named entity recognition (NER) and data mining (DM) can provide a real advantage in describing health care, adding new medical material and helping to integrate the EHR system into the health staff work environment.
Footnotes
1
n < 30. We removed the real number for NH de-identification and for data harmonization.
 
Metadata
Title
Empirical advances with text mining of electronic health records
Authors
T. Delespierre
P. Denormandie
A. Bar-Hen
L. Josseran
Publication date
01-12-2017
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2017
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-017-0519-0
