Published in: BMC Medical Informatics and Decision Making 1/2017

Open Access 01-12-2017 | Research article

Development of an algorithm for determining smoking status and behaviour over the life course from UK electronic primary care records

Authors: Mark D. Atkinson, Jonathan I. Kennedy, Ann John, Keir E. Lewis, Ronan A. Lyons, Sinead T. Brophy, on behalf of the DEMISTIFY Research Group

Abstract

Background

Patients’ smoking status is routinely collected by General Practitioners (GPs) in UK primary health care. There is an abundance of Read codes pertaining to smoking, including smoking cessation therapy, prescription, and administration codes, in addition to the more regularly employed smoking status codes. Large databases of primary care data are increasingly used for epidemiological analysis, and smoking status is an important covariate in many such analyses. However, how the smoking status variable is defined is rarely documented in the literature.

Methods

The Secure Anonymised Information Linkage (SAIL) databank is a repository for a national collection of person-based anonymised health and socio-economic administrative data in Wales, UK. GP smoking status data from the SAIL databank were examined to explore the range of codes available and how they could be used to identify different categories of smokers, ex-smokers and never smokers. An algorithm was developed that addresses inconsistencies and changes in smoking status recording across the life course. Its output was compared at the individual level with smoking status as recorded in the Welsh Health Survey (WHS) for 2013 and 2014. However, the WHS could not be regarded as a “gold standard” for validation.
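
As a rough illustration of what resolving an inconsistency across the life course can look like, the sketch below takes a patient's dated status records and returns a status for a given date. It is not the authors' algorithm. The status labels, the function name `status_on`, and the single rule used here (a "never" record that follows any earlier smoking record is treated as "ex") are all assumptions made for this example.

```python
from datetime import date

# Hypothetical status labels; real records would be mapped from Read codes.
NEVER, EX, CURRENT = "never", "ex", "current"


def status_on(records, as_of):
    """Return a conflict-resolved smoking status as of a given date.

    records: iterable of (date, status) tuples for one patient.
    as_of:   the date at which the status is wanted.

    Assumed rule (illustrative only): if the most recent record says
    "never" but an earlier record indicates smoking, report "ex".
    Returns None when no record exists on or before as_of.
    """
    history = sorted((d, s) for d, s in records if d <= as_of)
    if not history:
        return None
    latest = history[-1][1]
    ever_smoked = any(s in (EX, CURRENT) for _, s in history)
    if latest == NEVER and ever_smoked:
        return EX
    return latest


# Made-up example: a "never" record entered after a "current" record.
example = [(date(2005, 1, 1), CURRENT), (date(2010, 6, 1), NEVER)]
print(status_on(example, date(2014, 1, 1)))
```

Under this assumed rule, a patient recorded as a smoker in 2005 and as a never smoker in 2010 is reported as an ex-smoker in 2014, which is the kind of temporal conflict the abstract refers to.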

Results

There were 6836 individuals in the linked dataset. Missing data were more common in GP records (6%) than in WHS (1.1%). Our algorithm assigns ex-smoker status to 34% of never-smokers, and detects 30% more smokers than are declared in the WHS data. When distinguishing between current smokers and non-smokers, the agreement between the WHS and GP data using the nearest date of comparison was κ = 0.78. When temporal conflicts had been accounted for, the agreement was κ = 0.64, showing the importance of addressing conflicts.
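
For readers unfamiliar with the κ statistic, the sketch below shows how a chance-corrected agreement of this kind can be computed for two sets of labels. It assumes the unweighted Cohen's kappa, which the abstract does not specify. The function name `cohen_kappa` and the counts in the example are invented for illustration and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed proportion of individuals given the same label by both sources.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each source's marginal frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources use a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up data for 100 individuals (GP-derived label, survey label).
pairs = (
    [("smoker", "smoker")] * 20
    + [("smoker", "non-smoker")] * 5
    + [("non-smoker", "smoker")] * 10
    + [("non-smoker", "non-smoker")] * 65
)
gp_labels = [gp for gp, _ in pairs]
survey_labels = [survey for _, survey in pairs]
print(round(cohen_kappa(gp_labels, survey_labels), 3))
```

In this invented example the two sources agree on 85% of individuals, while 60% agreement would be expected by chance, so κ = (0.85 − 0.60) / (1 − 0.60) = 0.625.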

Conclusions

We present an algorithm for the identification of a patient’s smoking status using GP self-reported data. We have included sufficient detail to allow others to replicate this work, thereby raising the standard of documentation in this research area and of the assessment of smoking status in routine data.
Metadata
Title
Development of an algorithm for determining smoking status and behaviour over the life course from UK electronic primary care records
Authors
Mark D. Atkinson
Jonathan I. Kennedy
Ann John
Keir E. Lewis
Ronan A. Lyons
Sinead T. Brophy
on behalf of the DEMISTIFY Research Group
Publication date
01-12-2017
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2017
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-016-0400-6