Published in: Orphanet Journal of Rare Diseases 1/2018

Open Access 01-12-2018 | Research

Next generation phenotyping using narrative reports in a rare disease clinical data warehouse

Authors: Nicolas Garcelon, Antoine Neuraz, Rémi Salomon, Nadia Bahi-Buisson, Jeanne Amiel, Capucine Picard, Nizar Mahlaoui, Vincent Benoit, Anita Burgun, Bastien Rance

Abstract

Background

Secondary use of data collected in Electronic Health Records opens new avenues for increasing our knowledge of rare diseases. The clinical data warehouse (named Dr. Warehouse) at the Necker-Enfants Malades Children’s Hospital contains data collected during normal care for thousands of patients. Dr. Warehouse is oriented toward the exploration of clinical narratives. In this study, we present our method for finding phenotypes associated with diseases of interest.

Methods

We used frequency and TF-IDF (term frequency-inverse document frequency) to explore the association between clinical phenotypes and rare diseases. We applied our method to six use cases: phenotypes associated with Rett, Lowe, Silver-Russell, and Bardet-Biedl syndromes, DOCK8 deficiency, and Activated PI3-kinase Delta Syndrome (APDS). We asked domain experts to evaluate the relevance of the top 50 phenotypes identified by Dr. Warehouse under each ranking (frequency and TF-IDF), and we computed the average precision and mean average precision.
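The ranking step can be illustrated with a minimal sketch. It assumes that each patient is reduced to the set of phenotype concepts found in their narratives, that frequency is the number of cohort patients carrying a concept, and that TF-IDF is a textbook variant whose inverse document frequency is computed over all patients in the warehouse. The abstract does not give the exact weighting used in Dr. Warehouse, so the formula, the function name `rank_phenotypes`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def rank_phenotypes(cohort, warehouse, top_k=50):
    """Rank concepts for a cohort by frequency and by a textbook TF-IDF.

    cohort and warehouse map a patient id to a set of concept labels;
    the cohort is assumed to be a subset of the warehouse.
    """
    # Frequency: number of cohort patients whose records mention the concept.
    tf = Counter(c for concepts in cohort.values() for c in concepts)
    # Document frequency: number of warehouse patients with the concept.
    df = Counter(c for concepts in warehouse.values() for c in concepts)
    n = len(warehouse)
    # Assumed weighting: cohort frequency times log inverse document frequency.
    tfidf = {c: tf[c] * math.log(n / df[c]) for c in tf}
    # Ties are broken alphabetically so the output is deterministic.
    by_freq = sorted(tf, key=lambda c: (-tf[c], c))[:top_k]
    by_tfidf = sorted(tfidf, key=lambda c: (-tfidf[c], c))[:top_k]
    return by_freq, by_tfidf


# Toy data, invented for illustration only.
warehouse = {
    "p1": {"fever", "scoliosis", "epilepsy"},
    "p2": {"fever", "epilepsy"},
    "p3": {"fever"},
    "p4": {"fever", "cough"},
    "p5": {"cough"},
}
cohort = {pid: warehouse[pid] for pid in ("p1", "p2")}
by_freq, by_tfidf = rank_phenotypes(cohort, warehouse)
```

In this toy example "fever" is frequent in the cohort but also common across the whole warehouse, so the TF-IDF ranking places it below the rarer "scoliosis", whereas the frequency ranking does not.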

Results

Experts concluded that between 16 and 39 of the top 50 phenotypes discovered by Dr. Warehouse and ranked by descending frequency could be considered relevant (between 11 and 41 for TF-IDF). Average precision ranged from 0.55 to 0.91 for frequency and from 0.52 to 0.95 for TF-IDF. Mean average precision was 0.79. Our study suggests that phenotypes identified in clinical narratives stored in Electronic Health Records can provide rare disease specialists with candidate phenotypes that can be used in addition to the literature.
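For readers unfamiliar with the evaluation metrics, the following is a minimal sketch of how average precision and mean average precision can be computed from expert relevance judgements on a ranked list. It assumes the common definition in which precision is averaged over the ranks of the relevant items found in the list; the abstract does not state the exact variant used, and the function names and sample judgements are hypothetical.

```python
def average_precision(relevance):
    """Average precision of a ranked list of judgements (True = relevant)."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Assumption: normalise by the number of relevant items in the list.
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the average precisions over several ranked lists."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical judgements for two short rankings, not data from the study.
ranking_a = [True, False, True, True]
ranking_b = [False, True]
ap_a = average_precision(ranking_a)
ap_b = average_precision(ranking_b)
map_score = mean_average_precision([ranking_a, ranking_b])
```

A ranking that places relevant phenotypes near the top scores higher than one with the same number of relevant phenotypes placed lower, which is why the metric complements the raw counts of relevant phenotypes.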

Conclusions

Clinical Data Warehouses can be used to perform Next Generation Phenotyping, especially in the context of rare diseases. We have developed a method to detect phenotypes associated with a group of patients using medical concepts extracted from free-text clinical narratives.
Metadata
Publisher: BioMed Central
Electronic ISSN: 1750-1172
DOI: https://doi.org/10.1186/s13023-018-0830-6
