Published in: BMC Medical Research Methodology 1/2013

Open Access 01-12-2013 | Research article

Optimising the use of electronic health records to estimate the incidence of rheumatoid arthritis in primary care: what information is hidden in free text?

Abstract

Background

Primary care databases are a major source of data for epidemiological and health services research. However, most studies are based on coded information, ignoring information stored in free text. Using the early presentation of rheumatoid arthritis (RA) as an exemplar, our objective was to estimate the extent of data hidden within free text, using a keyword search.

Methods

We examined the electronic health records (EHRs) of 6,387 patients from the UK, aged 30 years and older, with a first coded diagnosis of RA between 2005 and 2008. We listed indicators for RA that were present in coded format and ran keyword searches for similar information held in free text. The frequencies of indicator code groups and keywords from one year before to 14 days after RA diagnosis were compared, and their temporal relationships were examined.
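
As an illustration only, the keyword search over a time window around the coded diagnosis could be sketched as follows. This is a minimal sketch, not the study's actual method: the keyword groups, the record layout (a list of date and text pairs) and the function names `compile_patterns` and `keywords_in_window` are all assumptions made for the example. It also ignores negation and misspellings, which any real free-text search would need to handle.

```python
import re
from datetime import date, timedelta

# Hypothetical keyword groups; the study's real keyword lists are not given here.
KEYWORDS = {
    "rheumatoid_arthritis": ["rheumatoid arthritis"],
    "inflammatory_arthritis": ["inflammatory arthritis"],
    "synovitis": ["synovitis"],
}


def compile_patterns(groups):
    """Build one case-insensitive regular expression per keyword group."""
    return {
        name: re.compile("|".join(re.escape(k) for k in words), re.IGNORECASE)
        for name, words in groups.items()
    }


def keywords_in_window(entries, diagnosis_date, patterns,
                       days_before=365, days_after=14):
    """Return the earliest date each keyword group appears in free text.

    entries: iterable of (entry_date, free_text) pairs for one patient
             (an assumed layout, not a real database schema).
    Only entries dated from `days_before` days before to `days_after`
    days after the coded diagnosis date are searched (bounds inclusive).
    """
    start = diagnosis_date - timedelta(days=days_before)
    end = diagnosis_date + timedelta(days=days_after)
    first_seen = {}
    for entry_date, text in entries:
        if not (start <= entry_date <= end):
            continue  # outside the comparison window
        for name, pattern in patterns.items():
            if pattern.search(text):
                # Keep the earliest matching date for this keyword group.
                if name not in first_seen or entry_date < first_seen[name]:
                    first_seen[name] = entry_date
    return first_seen


if __name__ == "__main__":
    # Invented example records for a single patient.
    records = [
        (date(2006, 1, 10), "Letter from clinic: ?rheumatoid arthritis, synovitis of MCP joints"),
        (date(2004, 1, 1), "Synovitis noted"),        # too early, ignored
        (date(2006, 3, 20), "Synovitis both wrists"),  # too late, ignored
    ]
    hits = keywords_in_window(records, date(2006, 3, 1), compile_patterns(KEYWORDS))
    for group, first_date in sorted(hits.items()):
        print(group, first_date.isoformat())
```

Recording the earliest matching date per keyword group makes it possible to compare when a suspicion first appears in text with when the diagnosis was coded, which is the temporal relationship the methods describe.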

Results

One or more keywords for RA were found in the free text of 29% of patients prior to the RA diagnostic code. Keywords for inflammatory arthritis diagnoses were present for 14% of patients, whereas only 11% had a diagnostic code. Codes for synovitis were found in 3% of patients, but keywords were identified in an additional 17%. In 13% of patients, evidence of a positive rheumatoid factor test was found in text only, with no corresponding code. No gender differences were found. Keywords generally occurred close in time to the coded diagnosis of rheumatoid arthritis. They were often found under codes indicating letters and communications.

Conclusions

Potential cases may be missed or wrongly dated when coded data alone are used to identify patients with RA, as diagnostic suspicions are frequently confined to text. The use of EHRs to create disease registers or assess quality of care will be misleading if free text information is not taken into account. Methods to facilitate the automated processing of text need to be developed and implemented.
Metadata
Title
Optimising the use of electronic health records to estimate the incidence of rheumatoid arthritis in primary care: what information is hidden in free text?
Publication date
01-12-2013
Published in
BMC Medical Research Methodology / Issue 1/2013
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-13-105
