Published in: BMC Medical Informatics and Decision Making 1/2014

Open Access 01-12-2014 | Correspondence

Technical challenges of providing record linkage services for research

Authors: James H Boyd, Sean M Randall, Anna M Ferrante, Jacqueline K Bauer, Adrian P Brown, James B Semmens


Abstract

Background

Record linkage techniques are widely used to give health researchers event-based longitudinal information for entire populations. The task of record linkage is increasingly being undertaken by specialised linkage units (SLUs). In addition to the complexity of undertaking probabilistic record linkage, these units face further technical challenges in providing record linkage ‘as a service’ for research. The extent of this functionality, and approaches to solving these issues, have received little attention in the record linkage literature. Few, if any, of the record linkage packages or systems currently used by SLUs include the full range of functions required.

Methods

This paper identifies and discusses some of the functions that are required or undertaken by SLUs in the provision of record linkage services. These include managing routine, ongoing linkage; storing and handling changing data; handling different linkage scenarios; and accommodating ever-increasing datasets. Automated linkage processes are one way of ensuring consistency of results and scalability of service.

Results

Alternative solutions to some of these challenges are presented. Maintaining a full history of links and storing pairwise information solves many of the challenges around handling ‘open’ records and providing automated, managed extractions. A number of these solutions were implemented as part of the development of the National Linkage System (NLS) by the Centre for Data Linkage (part of the Population Health Research Network) in Australia.
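
As an illustration only, the idea of keeping a full history of pairwise links can be sketched as follows. This is not the NLS implementation: the names `PairLink` and `LinkStore`, the integer timestamps, and the use of union-find to derive groups are all assumptions made for the example. Each link between two records is stored with the time it was created and, if it is later judged incorrect, the time it was retired, so that the groups of linked records can be rebuilt as they stood at any point in time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PairLink:
    """One pairwise link between two record IDs (hypothetical structure)."""
    rec_a: str
    rec_b: str
    created: int                    # assumed integer timestamp, e.g. a load number
    retired: Optional[int] = None   # set when the link is later judged wrong


class LinkStore:
    """Keeps every link ever made; links are retired, never deleted."""

    def __init__(self):
        self.links = []

    def add(self, rec_a, rec_b, when):
        self.links.append(PairLink(rec_a, rec_b, when))

    def retire(self, rec_a, rec_b, when):
        # Mark the matching open link as retired but keep it in the history.
        for link in self.links:
            if {link.rec_a, link.rec_b} == {rec_a, rec_b} and link.retired is None:
                link.retired = when

    def groups_at(self, when):
        """Rebuild groups of linked records as they stood at time `when`."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for link in self.links:
            if link.created > when:
                continue  # link (and possibly its records) not yet known
            root_a, root_b = find(link.rec_a), find(link.rec_b)
            still_active = link.retired is None or link.retired > when
            if still_active:
                parent[root_a] = root_b

        groups = {}
        for rec in parent:
            groups.setdefault(find(rec), set()).add(rec)
        return sorted(sorted(members) for members in groups.values())


store = LinkStore()
store.add("A1", "B7", when=1)
store.add("B7", "C3", when=2)
store.retire("B7", "C3", when=3)   # link later found to be wrong

print(store.groups_at(2))
print(store.groups_at(3))
```

Because groups are derived from the stored pairs rather than stored directly, retiring one link splits a group without discarding the information that the link once existed, which is what would allow an extraction to be reproduced as of an earlier date.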

Conclusions

The demand for, and complexity of, linkage services is growing. This presents a challenge to SLUs as they seek to service the varying needs of dozens of research projects annually. Linkage units need to be both flexible and scalable to meet this demand. It is hoped the solutions presented here can help mitigate these difficulties.
Metadata
Title
Technical challenges of providing record linkage services for research
Authors
James H Boyd
Sean M Randall
Anna M Ferrante
Jacqueline K Bauer
Adrian P Brown
James B Semmens
Publication date
01-12-2014
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2014
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/1472-6947-14-23
