Published in: BMC Medical Informatics and Decision Making 1/2019

Open Access 01-12-2019 | Research article

Building a tobacco user registry by extracting multiple smoking behaviors from clinical notes

Authors: Ellen L. Palmer, Saeed Hassanpour, John Higgins, Jennifer A. Doherty, Tracy Onega


Abstract

Background

Structured fields in Electronic Health Records (EHRs) are an important source for ascertaining smoking history, but they fail to capture the nuances of smoking behaviors. Knowledge of smoking behaviors, such as pack-year history and most recent cessation date, allows care providers to select the best care plan for patients at risk of smoking-attributable diseases.

Methods

We developed and evaluated a health informatics pipeline for identifying complete smoking history from clinical notes in EHRs. To assess the performance of this pipeline, we used 758 patient-visit notes (from visits between 03/28/2016 and 04/04/2016) from our local EHR and a public dataset of 502 clinical notes from the 2006 i2b2 Challenge. We used a machine-learning classifier to extract smoking status and a comprehensive set of text-processing regular expressions to extract pack-year and cessation date information from these clinical notes.
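The regular-expression step can be illustrated with a minimal Python sketch. The patterns, the function name `extract_smoking_history`, and the example notes below are hypothetical and written only for illustration; the expressions actually used in the study are not given in this abstract and are likely far more comprehensive. The sketch also assumes the conventional definition of pack-years as packs per day multiplied by years smoked when no explicit pack-year value is found.

```python
import re

# Hypothetical patterns for illustration only; not the study's expressions.
NUM = r"(\d+(?:\.\d+)?)"
PACK_YEARS = re.compile(
    NUM + r"\s*-?\s*(?:pack|pk)\s*-?\s*(?:years?|yrs?)\b", re.IGNORECASE)
PACKS_PER_DAY = re.compile(
    NUM + r"\s*(?:ppd\b|packs?\s*(?:per|a|/)\s*day\b)", re.IGNORECASE)
YEARS_SMOKED = re.compile(
    r"\b(?:for|x)\s*" + NUM + r"\s*(?:years?|yrs?)\b", re.IGNORECASE)
QUIT_YEAR = re.compile(
    r"\bquit\s+(?:smoking\s+)?(?:in\s+)?((?:19|20)\d{2})\b", re.IGNORECASE)


def _first_number(pattern, text):
    """Return the first captured number as a float, or None if no match."""
    match = pattern.search(text)
    return float(match.group(1)) if match else None


def extract_smoking_history(note):
    """Extract pack-years and cessation year from one free-text note."""
    pack_years = _first_number(PACK_YEARS, note)
    packs_per_day = _first_number(PACKS_PER_DAY, note)
    years_smoked = _first_number(YEARS_SMOKED, note)

    # Assumption: derive pack-years from its components when it is not
    # stated directly. If only one component is present, the extraction
    # stays incomplete and pack_years remains None.
    if pack_years is None and packs_per_day is not None and years_smoked is not None:
        pack_years = packs_per_day * years_smoked

    quit_match = QUIT_YEAR.search(note)
    return {
        "pack_years": pack_years,
        "packs_per_day": packs_per_day,
        "years_smoked": years_smoked,
        "quit_year": int(quit_match.group(1)) if quit_match else None,
    }


if __name__ == "__main__":
    print(extract_smoking_history(
        "Former smoker, quit smoking in 2010, 30 pack-year history."))
    print(extract_smoking_history("Smokes 1.5 packs per day for 20 years."))
```

A note that mentions only packs per day (for example "Smokes 1 ppd.") leaves `pack_years` as `None` in this sketch, which mirrors the kind of incomplete extraction a pattern-based approach can produce when the context varies.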

Results

We identified smoking status with an F1 score of 0.90 on both the i2b2 and local data sets. Regular expression identification of pack-year history in the local test set was 91.7% sensitive and 95.2% specific, but because of variable context the pack-year extraction was incomplete in 25% of cases, capturing only packs per day or only years smoked. Regular expression identification of cessation date was 63.2% sensitive and 94.6% specific.
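For readers less familiar with these metrics, the following Python sketch shows how sensitivity, specificity, and the F1 score are computed from confusion-matrix counts. The function names and the counts in the usage lines are hypothetical and chosen only to illustrate the arithmetic; they are not the counts from this study.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cases correctly left undetected: TN / (TN + FP)."""
    return tn / (tn + fp)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    tp, fn, tn, fp = 9, 1, 18, 2
    print(f"sensitivity = {sensitivity(tp, fn):.3f}")
    print(f"specificity = {specificity(tn, fp):.3f}")
    print(f"F1          = {f1_score(tp, fp, fn):.3f}")
```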

Conclusions

Our work indicates that it is feasible to use an informatics pipeline to develop an EHR-based Smokers’ Registry that contains information on smoking behaviors, not just smoking status, drawn from free-text clinical notes. This pipeline is capable of functioning in external EHRs, reducing the time and money needed at the institution level to create a Smokers’ Registry for improved identification of patient risk and eligibility for preventive and early detection services.
Metadata
Title
Building a tobacco user registry by extracting multiple smoking behaviors from clinical notes
Authors
Ellen L. Palmer
Saeed Hassanpour
John Higgins
Jennifer A. Doherty
Tracy Onega
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2019
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-019-0863-3
