Published in: BMC Medical Informatics and Decision Making 1/2012

Open Access 01-12-2012 | Research article

Identification of pneumonia and influenza deaths using the death certificate pipeline

Authors: Kailah Davis, Catherine Staes, Jeff Duncan, Sean Igo, Julio C Facelli

Abstract

Background

Death records are a rich source of data that can support public health surveillance and decision support. To be used for such purposes, however, the data must first be transformed into a coded format that makes them computable. Because the cause of death is reported on certificates as free text, encoding the data is currently the single largest barrier to using death certificates for surveillance. The purpose of this study was therefore to demonstrate the feasibility of a pipeline, composed of a detection rule and a natural language processor, for the real-time encoding of death certificates, using the identification of pneumonia and influenza cases as an example, and to show that its accuracy is comparable to that of existing methods.

Results

A Death Certificates Pipeline (DCP) was developed to automatically code death certificates and identify pneumonia and influenza cases. The pipeline used MetaMap to code death certificates from the Utah Department of Health for the year 2008. The MetaMap output was then passed to detection rules, which flagged pneumonia and influenza cases based on the Centers for Disease Control and Prevention (CDC) case definition. The output from the DCP was compared with the current method used by the CDC and with a keyword search. Recall, precision (positive predictive value) and F-measure with respect to the CDC method were calculated for the DCP and for the keyword search. Recall/precision were 0.998/0.98 for the DCP and 0.96/0.96 for keyword searching; the corresponding F-measures were 0.99 and 0.96. Both the keyword search and the DCP can run interactively with modest computing resources, but the DCP showed superior performance.
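The evaluation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword list, the toy records, and the function names `keyword_flag`, `evaluate`, and `f_measure` are all hypothetical. It only shows how recall, precision, and F-measure might be computed for a flagging method against a reference standard such as the CDC method.

```python
from typing import Dict, Iterable, Set

# Hypothetical keyword list, for illustration only; the study's actual
# search terms are not given in this abstract.
ILLUSTRATIVE_KEYWORDS = ("pneumonia", "influenza")


def keyword_flag(records: Dict[int, str],
                 keywords: Iterable[str] = ILLUSTRATIVE_KEYWORDS) -> Set[int]:
    """Return IDs of records whose free text contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return {rid for rid, text in records.items()
            if any(k in text.lower() for k in lowered)}


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(reference: Set[int], flagged: Set[int]) -> Dict[str, float]:
    """Score a set of flagged record IDs against a reference-standard set."""
    tp = len(reference & flagged)   # flagged and in the reference standard
    fp = len(flagged - reference)   # flagged but not in the reference standard
    fn = len(reference - flagged)   # in the reference standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure(precision, recall)}


# Toy data (invented): record 6 is misspelled, so a plain keyword search misses it.
toy_records = {
    1: "Pneumonia; sepsis",
    2: "INFLUENZA A",
    3: "Myocardial infarction",
    5: "Bronchopneumonia",
    6: "Pnuemonia",
}
toy_reference = {1, 2, 5, 6}  # stands in for the cases identified by the reference method

toy_flagged = keyword_flag(toy_records)
toy_scores = evaluate(toy_reference, toy_flagged)
```

Applying `f_measure` to the reported values gives about 0.989 for the DCP (precision 0.98, recall 0.998) and 0.96 for keyword searching (0.96, 0.96), which matches the rounded F-measures of 0.99 and 0.96 stated above.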

Conclusion

The pipeline proposed here for coding death certificates and detecting cases is feasible and can be extended to other conditions. This method offers an alternative that allows free-text death certificates to be coded in real time, which may increase their use not only in the public health domain but also among biomedical researchers and developers.

Trial Registration

This study did not involve any clinical trials.
Metadata
Publisher
BioMed Central
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/1472-6947-12-37
