Published in: Drug Safety 4/2014

01-04-2014 | Original Research Article

Performance of Probabilistic Method to Detect Duplicate Individual Case Safety Reports

Authors: Philip Michael Tregunno, Dorthe Bech Fink, Cristina Fernandez-Fernandez, Edurne Lázaro-Bengoa, G. Niklas Norén

Abstract

Background

Individual case reports of suspected harm from medicines are fundamental for signal detection in postmarketing surveillance. Their effective analysis requires reliable data, and one challenge is report duplication: multiple unlinked records describing the same suspected adverse drug reaction (ADR) in a particular patient. Duplicates distort statistical screening and can mislead clinical assessment. Many organisations rely on rule-based detection, but probabilistic record matching is an alternative.

Objectives

The aim of this study was to evaluate probabilistic record matching for duplicate detection, and to characterise the main sources of duplicate reports within each data set.

Research Design

vigiMatch™, a published probabilistic record matching algorithm, was applied to the WHO global individual case safety reports database, VigiBase®, for reports submitted between 2000 and 2010. Reported drugs, ADRs, patient age, sex, country of origin, and date of onset were considered in the matching. Suspected duplicates for the UK, Denmark, and Spain were reviewed and classified by the respective national centre. This included evaluation to determine whether confirmed duplicates had already been identified by in-house, rule-based screening. Furthermore, each confirmed duplicate was classified with respect to the likely source of duplication.
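As a rough illustration of how probabilistic record matching can score a pair of reports on the fields listed above, the following minimal sketch uses a generic log-likelihood-ratio weighting in the style of classical record linkage. It is not the vigiMatch hit-miss model: the `Report` class, the agreement probabilities in `FIELD_PROBS`, and the per-item rewards and penalties are all hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from math import log2
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Report:
    """Hypothetical, simplified case report with the fields used for matching."""
    drugs: FrozenSet[str]
    adrs: FrozenSet[str]
    age: Optional[int] = None
    sex: Optional[str] = None
    country: Optional[str] = None
    onset: Optional[str] = None  # ISO date string, e.g. "2009-03-01"


# Assumed probabilities per field: (m, u) where
# m = P(values agree | true duplicates), u = P(values agree | unrelated reports).
FIELD_PROBS = {
    "age": (0.90, 0.02),
    "sex": (0.95, 0.50),
    "country": (0.98, 0.10),
    "onset": (0.85, 0.001),
}

# Assumed reward per shared item and penalty per unshared item.
ITEM_REWARD = {"drugs": 4.0, "adrs": 3.0}
ITEM_PENALTY = {"drugs": -1.0, "adrs": -0.5}


def field_weight(name, a, b):
    """Log-likelihood-ratio weight for one single-valued field."""
    if a is None or b is None:
        return 0.0  # missing information neither rewards nor penalises
    m, u = FIELD_PROBS[name]
    if a == b:
        return log2(m / u)
    return log2((1 - m) / (1 - u))


def set_weight(name, a, b):
    """Reward each shared drug or ADR term, penalise each unshared one."""
    shared = len(a & b)
    unshared = len(a ^ b)
    return shared * ITEM_REWARD[name] + unshared * ITEM_PENALTY[name]


def match_score(r1, r2):
    """Total match score; higher values suggest a more likely duplicate pair."""
    score = sum(field_weight(f, getattr(r1, f), getattr(r2, f)) for f in FIELD_PROBS)
    score += set_weight("drugs", r1.drugs, r2.drugs)
    score += set_weight("adrs", r1.adrs, r2.adrs)
    return score
```

In a sketch like this, a pair would be flagged as a suspected duplicate when `match_score` exceeds some threshold, which would itself have to be calibrated on reviewed data.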

Measures

For each country, the proportions of suspected duplicates classified as confirmed duplicates, likely duplicates, otherwise related, and unrelated were obtained. The proportions of confirmed or likely duplicates that were not previously known by the national organisation were determined, and variations in the rates of suspected duplicates across subsets of reports were characterised.
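The measures described above reduce to simple proportions over the reviewers' classifications. The sketch below shows one way they could be computed, assuming each reviewed pair carries one of four labels; the label strings and function names are hypothetical and are not taken from the study.

```python
from collections import Counter

# Assumed label names for the four review outcomes described in the text.
CATEGORIES = ("confirmed", "likely", "otherwise_related", "unrelated")


def review_proportions(labels):
    """Proportion of reviewed suspected duplicates in each category."""
    counts = Counter(labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected review labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reviewed pairs")
    return {c: counts[c] / total for c in CATEGORIES}


def predictive_value(labels):
    """Share of suspected duplicates judged confirmed or likely duplicates."""
    props = review_proportions(labels)
    return props["confirmed"] + props["likely"]


def previously_unknown_share(reviews):
    """Share of confirmed or likely duplicates not already known to the centre.

    `reviews` is an iterable of (label, already_known) pairs.
    """
    known_flags = [known for label, known in reviews if label in ("confirmed", "likely")]
    if not known_flags:
        raise ValueError("no confirmed or likely duplicates")
    return sum(1 for known in known_flags if not known) / len(known_flags)
```

Here the predictive value is computed only over pairs that the algorithm flagged, so it says nothing about duplicates that were never flagged.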

Results

Overall, 2.5 % of the reports with sufficient information to be evaluated by vigiMatch were classified as suspected duplicates. The rates for the three countries considered in this study were 1.4 % (UK), 1.0 % (Denmark), and 0.7 % (Spain). Higher rates of suspected duplicates were observed for literature reports (11 %) and reports with fatal outcome (5 %), whereas a lower rate was observed for reports from consumers and non-health professionals (0.5 %). The predictive value for confirmed or likely duplicates among reports flagged as suspected duplicates by vigiMatch ranged from 86 % for the UK, to 64 % for Denmark and 33 % for Spain. The proportions of confirmed duplicates that were previously unknown to national centres ranged from 89 % for Spain, to 60 % for the UK and 38 % for Denmark, despite in-house duplicate detection processes in routine use. The proportion of unrelated cases among suspected duplicates was below 10 % for each national centre in the study.

Conclusions

Probabilistic record matching, as implemented in vigiMatch, achieved good predictive value for confirmed or likely duplicates in each data source. Most of the false positives corresponded to otherwise related reports; less than 10 % were altogether unrelated. A substantial proportion of the correctly identified duplicates had not previously been detected by national centre activity. On one hand, vigiMatch highlighted duplicates that had been missed by rule-based methods, and on the other hand its lower total number of suspected duplicates to review improved the accuracy of manual review.
Footnotes
1
Matching drugs tend to be highly rewarded by vigiMatch, so the more drugs that are listed on a report, the more likely the report is to receive a sufficiently high score when matched against itself.
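One reading of this footnote is that a report's score against itself grows with each listed drug, so reports listing more drugs clear a fixed threshold more easily. The toy sketch below illustrates that reading only; the reward, baseline, and threshold values are hypothetical and do not come from vigiMatch.

```python
# Hypothetical values, chosen only for illustration.
DRUG_REWARD = 4.0          # assumed score contribution per matching drug
OTHER_FIELDS_SCORE = 6.0   # assumed contribution of all other matching fields
THRESHOLD = 15.0           # assumed minimum self-match score


def self_match_score(n_drugs: int) -> float:
    """Score of a report matched against itself, where every drug matches."""
    return OTHER_FIELDS_SCORE + n_drugs * DRUG_REWARD


def has_sufficient_score(n_drugs: int) -> bool:
    """Whether the self-match score reaches the assumed threshold."""
    return self_match_score(n_drugs) >= THRESHOLD
```

With these made-up numbers, a report listing one drug falls short of the threshold while a report listing three drugs reaches it.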
 
Metadata
Publisher
Springer International Publishing
Published in
Drug Safety / Issue 4/2014
Print ISSN: 0114-5916
Electronic ISSN: 1179-1942
DOI
https://doi.org/10.1007/s40264-014-0146-y
