Published in: BMC Medical Research Methodology 1/2011

Open Access 01-12-2011 | Research article

Validation of de-identified record linkage to ascertain hospital admissions in a cohort study

Authors: Alison Beauchamp, Andrew M Tonkin, Helen Kelsall, Vijaya Sundararajan, Dallas R English, Lalitha Sundaresan, Rory Wolfe, Gavin Turrell, Graham G Giles, Anna Peeters

Abstract

Background

Cohort studies can provide valuable evidence of cause and effect relationships but are subject to loss of participants over time, limiting the validity of findings. Computerised record linkage offers a passive and ongoing method of obtaining health outcomes from existing routinely collected data sources. However, the quality of record linkage is reliant upon the availability and accuracy of common identifying variables. We sought to develop and validate a method for linking a cohort study to a state-wide hospital admissions dataset with limited availability of unique identifying variables.

Methods

A sample of 2000 participants from a cohort study (n = 41 514) was linked to a state-wide hospitalisations dataset in Victoria, Australia, using the national health insurance (Medicare) number and demographic data as identifying variables. Availability of the health insurance number was limited in both datasets; linkage was therefore undertaken both with and without this number, and agreement between the two algorithms was tested. Sensitivity was calculated for a sub-sample of 101 participants with a hospital admission confirmed by medical record review.
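The two linkage passes described here can be illustrated with a minimal deterministic (exact-match) sketch. It is not the authors' algorithm: the field names (`medicare`, `sex`, `dob`, `postcode`), the choice of date of birth and postcode as the demographic variables, and the toy records are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Record = Dict[str, Optional[str]]


def link(cohort: Iterable[Record], hospital: Iterable[Record],
         keys: Tuple[str, ...]) -> Set[str]:
    """Return ids of cohort records that exactly match a hospital record on all keys."""
    index = set()
    for rec in hospital:
        values = tuple(rec.get(k) for k in keys)
        if all(values):  # ignore hospital records with a missing key value
            index.add(values)
    linked = set()
    for rec in cohort:
        values = tuple(rec.get(k) for k in keys)
        if all(values) and values in index:
            linked.add(rec["id"])
    return linked


# Hypothetical toy data; the Medicare number is missing for some records.
cohort: List[Record] = [
    {"id": "c1", "medicare": "1111", "sex": "F", "dob": "1950-01-01", "postcode": "3000"},
    {"id": "c2", "medicare": None, "sex": "M", "dob": "1945-06-30", "postcode": "3053"},
    {"id": "c3", "medicare": "3333", "sex": "F", "dob": "1960-03-15", "postcode": "3121"},
]
hospital: List[Record] = [
    {"medicare": "1111", "sex": "F", "dob": "1950-01-01", "postcode": "3000"},
    {"medicare": None, "sex": "M", "dob": "1945-06-30", "postcode": "3053"},
    {"medicare": "9999", "sex": "F", "dob": "1960-03-15", "postcode": "3121"},
]

# Pass 1: health insurance number plus sex.
with_number = link(cohort, hospital, ("medicare", "sex"))
# Pass 2: demographic details only (assumed variables).
demographic_only = link(cohort, hospital, ("dob", "sex", "postcode"))
# Participants whose link status differs between the two passes.
discordant = with_number ^ demographic_only
```

In this toy data, `c3` links on demographic details even though the Medicare numbers differ, which is the kind of link that would need review as a possible false positive.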

Results

Of the 2000 study participants, 85% were found to have a record in the hospitalisations dataset when the national health insurance number and sex were used as linkage variables, and 92% when only demographic details were used. When agreement between the two methods was tested, the disagreement fraction was 9%, mainly due to "false positive" links when only demographic details were used. A final algorithm that used multiple combinations of identifying variables resulted in a match proportion of 87%. Sensitivity of this final linkage was 95%.
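The summary measures reported above can be computed along the following lines. This is a sketch under stated assumptions: the disagreement fraction is taken to be the share of participants whose link status differs between the two algorithms, which the abstract does not define, and all counts below are hypothetical rather than the study's data.

```python
from typing import Set


def match_proportion(linked_ids: Set[int], n_participants: int) -> float:
    """Share of participants linked to at least one hospital record."""
    return len(linked_ids) / n_participants


def disagreement_fraction(links_a: Set[int], links_b: Set[int],
                          n_participants: int) -> float:
    """Share of participants linked by exactly one of the two algorithms (assumed definition)."""
    return len(links_a ^ links_b) / n_participants


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Confirmed admissions found by the linkage, divided by all confirmed admissions."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical example: 10 participants, identified by integers 1..10.
links_with_number = set(range(1, 9))   # participants 1-8 linked
links_demographic = set(range(1, 10))  # participants 1-9 linked

prop_with_number = match_proportion(links_with_number, 10)
prop_demographic = match_proportion(links_demographic, 10)
disagreement = disagreement_fraction(links_with_number, links_demographic, 10)

# Hypothetical validation sub-sample: 19 of 20 confirmed admissions were linked.
sens = sensitivity(true_positives=19, false_negatives=1)
```

Sensitivity here needs only the participants with a confirmed admission, which is why it can be estimated from a medical-record sub-sample rather than the full linked sample.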

Conclusions

High quality record linkage of cohort data with a hospitalisations dataset that has limited identifiers can be achieved using combinations of a national health insurance number and demographic data as identifying variables.
Metadata
Publication date
01-12-2011
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2011
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-11-42
