Published in: BMC Medical Informatics and Decision Making 1/2021

Open Access 01-12-2021 | Care | Research article

Record linkage under suboptimal conditions for data-intensive evaluation of primary care in Rio de Janeiro, Brazil

Authors: Claudia Medina Coeli, Valeria Saraceni, Paulo Mota Medeiros Jr., Helena Pereira da Silva Santos, Luis Carlos Torres Guillen, Luís Guilherme Santos Buteri Alves, Thomas Hone, Christopher Millett, Anete Trajman, Betina Durovni


Abstract

Background

Linking Brazilian databases requires algorithms and processes that can cope with several challenges: the large size of the databases, the small number and poor quality of the personal identifiers available for comparison (the national security number is not mandatory), and characteristics of Brazilian names that make the linkage process prone to error. This study aims to describe and evaluate the quality of the processes used to create an individual-linked database for data-intensive research on the impact of the expansion of primary care on health indicators in Rio de Janeiro City, Brazil.

Methods

We created an individual-level dataset linking data on social benefits recipients, primary health care, hospital admissions and mortality. The databases were pre-processed, and we adopted a multiple-approach strategy that combined deterministic and probabilistic record linkage techniques with an extensive clerical review of the potential matches. Relying on manual review as the gold standard, we estimated the false match (false-positive) proportion of each approach (deterministic, probabilistic, clerical review) and the missed match (false-negative) proportion of the clerical review approach. To assess the sensitivity (recall) of identifying deaths among social benefits recipients, we used their vital status as registered in the primary care database as the gold standard.
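To make the idea of a multiple-approach strategy concrete, the following is a minimal sketch of a two-pass linkage, not the authors' implementation. The record fields (`id`, `name`, `birth`), the `normalize` helper, and the 0.85 threshold are all hypothetical. The second pass uses a simple string-similarity threshold from Python's standard library as a stand-in; it is not a probabilistic (Fellegi-Sunter style) model, and pairs near the threshold would in practice go to clerical review.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Uppercase and collapse whitespace; a stand-in for real pre-processing."""
    return " ".join(name.upper().split())


def link(records_a, records_b, threshold=0.85):
    """Return (id_a, id_b, approach) tuples for linked records.

    Pass 1 (deterministic): exact agreement on normalized name and birth date.
    Pass 2 (similarity): same birth date and name similarity >= threshold.
    Field names and the threshold are illustrative assumptions.
    """
    exact_index = {}
    by_birth = {}
    for rec in records_b:
        key = (normalize(rec["name"]), rec["birth"])
        exact_index.setdefault(key, rec["id"])
        by_birth.setdefault(rec["birth"], []).append(rec)

    links = []
    for rec in records_a:
        name = normalize(rec["name"])
        match_id = exact_index.get((name, rec["birth"]))
        if match_id is not None:
            links.append((rec["id"], match_id, "deterministic"))
            continue
        # Fall back to the best-scoring candidate that shares the birth date.
        best_score, best_id = 0.0, None
        for cand in by_birth.get(rec["birth"], []):
            score = SequenceMatcher(None, name, normalize(cand["name"])).ratio()
            if score > best_score:
                best_score, best_id = score, cand["id"]
        if best_id is not None and best_score >= threshold:
            links.append((rec["id"], best_id, "similarity"))
    return links


# Hypothetical records, invented purely for illustration.
benefits = [
    {"id": "a1", "name": "Maria da Silva", "birth": "1980-01-02"},
    {"id": "a2", "name": "Jose Santos", "birth": "1975-05-05"},
    {"id": "a3", "name": "Ana Souza", "birth": "1990-09-09"},
]
mortality = [
    {"id": "b1", "name": "MARIA  DA SILVA", "birth": "1980-01-02"},
    {"id": "b2", "name": "Jose Santoz", "birth": "1975-05-05"},
    {"id": "b3", "name": "Carlos Lima", "birth": "1990-09-09"},
]
result = link(benefits, mortality)
```

Running the deterministic pass first means the slower pairwise comparison is only spent on records that exact agreement could not resolve, which matters when the databases are large.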

Results

In all linkage processes, the deterministic approach identified most of the matches. However, the proportion of matches identified by each approach varied. The false match proportion was around 1% or less in almost all approaches. The missed match proportion in the clerical review approach was under 3% in all linkage processes. We estimated a recall of 93.6% (95% CI 92.8–94.3) for the linkage between social benefits recipients and mortality data.
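As an illustration of how a recall estimate and its confidence interval can be computed from gold-standard counts, here is a minimal sketch. The abstract does not state which interval method or which counts were used, so the Wilson score interval and the numbers below are assumptions chosen for illustration only; they do not reproduce the reported figures.

```python
from math import sqrt


def recall(true_positives, false_negatives):
    """Proportion of gold-standard positives that the linkage identified."""
    return true_positives / (true_positives + false_negatives)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical counts: 90 deaths found by linkage out of 100 known deaths.
estimate = recall(true_positives=90, false_negatives=10)
low, high = wilson_interval(successes=90, total=100)
```

The same function applies to the false match and missed match proportions, with the count of errors found on manual review as `successes` and the number of reviewed pairs as `total`.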

Conclusion

Adopting a linkage strategy that combined pre-processing routines, deterministic and probabilistic approaches, and an extensive clerical review minimized linkage errors in the context of suboptimal data quality.
Metadata
Publisher
BioMed Central
Keyword
Care
Published in
BMC Medical Informatics and Decision Making / Issue 1/2021
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-021-01550-6