Published in: BMC Health Services Research 1/2010

Open Access 01-12-2010 | Research article

Empirical aspects of record linkage across multiple data sets using statistical linkage keys: the experience of the PIAC cohort study

Authors: Rosemary Karmel, Phil Anderson, Diane Gibson, Ann Peut, Stephen Duckett, Yvonne Wells



Abstract

Background

In Australia, many community service program data collections developed over the last decade, including several for aged care programs, contain a statistical linkage key (SLK) to enable derivation of client-level data. In addition, a common SLK is now used in many collections to facilitate the statistical examination of cross-program use. In 2005, the Pathways in Aged Care (PIAC) cohort study was funded to create a linked aged care database using the common SLK to enable analysis of pathways through aged care services.
Linkage using an SLK is commonly deterministic. The purpose of this paper is to describe an extended deterministic record linkage strategy for situations where there is a general person identifier (e.g. an SLK) and several additional variables suitable for data linkage. This approach can allow for variation in client information recorded on different databases.

Methods

A stepwise deterministic record linkage algorithm was developed to link datasets using an SLK and several other variables. Three measures of likely match accuracy were used: the discriminating power of match key values, an estimated false match rate, and an estimated step-specific trade-off between true and false matches. The method was validated by examining link properties and by clerical review of three samples of links.
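The stepwise idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the field names (`slk`, `dob`, `sex`, `postcode`), the SLK strings, the two match keys and their order are all hypothetical, and the three accuracy measures are not computed. The sketch only shows the general pattern of linking on a strict key first, then retrying the remaining records on a looser key, and accepting a link only when the key value is unique in both datasets.

```python
from collections import defaultdict


def key_of(record, fields):
    """Build a match key from the given fields; None if any field is missing."""
    values = tuple(record.get(f) for f in fields)
    return None if any(v in (None, "") for v in values) else values


def stepwise_link(a_records, b_records, steps):
    """Link two datasets in passes, one pass per match key in `steps`.

    Returns (index_in_a, index_in_b, step_number) tuples. Records linked
    in one step are removed before the next step runs.
    """
    links = []
    unlinked_a = dict(enumerate(a_records))
    unlinked_b = dict(enumerate(b_records))
    for step_no, fields in enumerate(steps, start=1):
        index_a, index_b = defaultdict(list), defaultdict(list)
        for i, rec in unlinked_a.items():
            index_a[key_of(rec, fields)].append(i)
        for j, rec in unlinked_b.items():
            index_b[key_of(rec, fields)].append(j)
        for key, ids_a in index_a.items():
            if key is None:
                continue  # incomplete key: cannot be used at this step
            ids_b = index_b.get(key, [])
            # Accept only one-to-one matches on this key value.
            if len(ids_a) == 1 and len(ids_b) == 1:
                links.append((ids_a[0], ids_b[0], step_no))
                del unlinked_a[ids_a[0]]
                del unlinked_b[ids_b[0]]
    return links


# Hypothetical records; the SLK values are made up for illustration.
dataset_a = [
    {"slk": "ABCDE010119302", "dob": "1930-01-01", "sex": "F", "postcode": "2600"},
    {"slk": "FGHIJ150619251", "dob": "1925-06-15", "sex": "M", "postcode": "3000"},
    {"slk": "KLMNO200319282", "dob": "1928-03-20", "sex": "F", "postcode": "4000"},
]
dataset_b = [
    {"slk": "FGHIJ150619251", "dob": "1925-06-15", "sex": "M", "postcode": "3000"},
    # Same person as dataset_a[0], but the SLK was recorded differently.
    {"slk": "ABCXE010119302", "dob": "1930-01-01", "sex": "F", "postcode": "2600"},
    {"slk": "ZZZZZ111119201", "dob": "1920-11-11", "sex": "M", "postcode": "5000"},
]

# Assumed step order: exact SLK first, then a looser key without the SLK.
steps = [("slk",), ("dob", "sex", "postcode")]
links = stepwise_link(dataset_a, dataset_b, steps)
print(links)
```

In this toy data the second pass recovers a pair that exact SLK matching misses. Recording the step number with each link makes it possible to review link quality step by step.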

Results

The deterministic algorithm resulted in up to an 11% increase in links compared with simple deterministic matching using an SLK. The links identified are of high quality: validation samples showed that less than 0.5% of links were false positives, and very few matches were made using non-unique match information (0.01%). There was a high degree of consistency in the characteristics of linked events.

Conclusions

The linkage strategy described in this paper has allowed the linking of multiple large aged care service datasets using a statistical linkage key while allowing for variation in its reporting. More widely, our deterministic algorithm, based on statistical properties of match keys, is a useful addition to the linker's toolkit. In particular, it may prove attractive when insufficient data are available for clerical review or follow-up, and the researcher has fewer options in relation to probabilistic linkage.
Metadata
Title
Empirical aspects of record linkage across multiple data sets using statistical linkage keys: the experience of the PIAC cohort study
Authors
Rosemary Karmel
Phil Anderson
Diane Gibson
Ann Peut
Stephen Duckett
Yvonne Wells
Publication date
01-12-2010
Publisher
BioMed Central
Published in
BMC Health Services Research / Issue 1/2010
Electronic ISSN: 1472-6963
DOI
https://doi.org/10.1186/1472-6963-10-41
