Published in: BMC Medical Research Methodology 1/2020

Open Access 01-12-2020 | Artificial Intelligence | Research article

An evaluation of DistillerSR’s machine learning-based prioritization tool for title/abstract screening – impact on reviewer-relevant outcomes

Authors: C. Hamel, S. E. Kelly, K. Thavorn, D. B. Rice, G. A. Wells, B. Hutton


Abstract

Background

Systematic reviews often require substantial resources, partially due to the large number of records identified during searching. Although artificial intelligence may not be ready to fully replace human reviewers, it may accelerate screening and reduce its burden. Using DistillerSR (May 2020 release), we evaluated the performance of its prioritization simulation tool to determine the reduction in screening burden and the time savings.

Methods

Response sets from 10 completed systematic reviews were used to evaluate, at a true recall @ 95%: (i) the reduction of screening burden; (ii) the accuracy of the prioritization algorithm; and (iii) the hours saved when a modified screening approach was implemented. To account for variation in the simulations and to introduce randomness (through shuffling the references), 10 simulations were run for each review. Means, standard deviations, medians and interquartile ranges (IQR) are presented.
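The core quantities in this design, the number of records that must be screened to reach a true recall @ 95% and the resulting reduction in screening burden, can be sketched in a few lines. DistillerSR's prioritization algorithm is proprietary, so the noisy relevance score below is a hypothetical stand-in for the learned ranking, and the record counts are toy values, not data from the 10 reviews:

```python
import math
import random
import statistics

def screened_at_recall(ranked_ids, include_ids, target=0.95):
    """Count how many top-ranked records must be screened before
    `target` recall of the true includes is reached."""
    needed = math.ceil(target * len(include_ids))
    found = 0
    for n, rec in enumerate(ranked_ids, start=1):
        if rec in include_ids:
            found += 1
            if found >= needed:
                return n
    return len(ranked_ids)  # target recall never reached

def burden_reduction(total, screened):
    """Percent of title/abstract records that never need screening."""
    return 100.0 * (total - screened) / total

# Toy response set: 1000 records, 50 true includes.
random.seed(1)
records = list(range(1000))
includes = set(random.sample(records, 50))

# 10 simulations per review, as in the paper's design. A hypothetical
# noisy score stands in for the prioritization algorithm: true includes
# tend to score higher, but the distributions overlap.
reductions = []
for _ in range(10):
    score = {r: random.uniform(0.4, 1.0) if r in includes
             else random.uniform(0.0, 0.8) for r in records}
    ranked = sorted(records, key=score.get, reverse=True)
    n = screened_at_recall(ranked, includes)
    reductions.append(burden_reduction(len(records), n))

print(f"median reduction in screening burden: {statistics.median(reductions):.1f}%")
```

The median and IQR over such per-simulation reductions correspond to the summary statistics reported in the Results.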

Results

Among the 10 systematic reviews, using true recall @ 95% there was a median reduction in screening burden of 47.1% (IQR: 37.5 to 58.0%). A median of 41.2% (IQR: 33.4 to 46.9%) of the excluded records needed to be screened to achieve true recall @ 95%. The median title/abstract screening hours saved using a modified screening approach at a true recall @ 95% was 29.8 h (IQR: 28.1 to 74.7 h). This increased to a median of 36 h (IQR: 32.2 to 79.7 h) when considering the time saved by not retrieving and screening full texts of the remaining 5% of records not yet identified as included at title/abstract. Among the 100 simulations (10 simulations per review), none of these 5% of records were a final included study in the systematic review. Compared with screening to true recall @ 100%, stopping at true recall @ 95% reduced the screening burden by a median of 40.6% (IQR: 38.3 to 54.2%).
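The two hour-savings figures above combine title/abstract time with avoided full-text retrieval and screening. A minimal sketch of that arithmetic, using illustrative per-record times and a dual-reviewer assumption that are placeholders rather than the paper's actual time model:

```python
def hours_saved(tiab_skipped, ft_skipped,
                sec_per_abstract=30, min_per_fulltext=5, reviewers=2):
    """Illustrative time-savings arithmetic.

    tiab_skipped -- title/abstract records left unscreened after stopping
    ft_skipped   -- full texts neither retrieved nor screened
    The per-record times and reviewer count are assumptions for the sketch.
    """
    tiab_h = tiab_skipped * sec_per_abstract * reviewers / 3600
    ft_h = ft_skipped * min_per_fulltext * reviewers / 60
    return tiab_h, ft_h, tiab_h + ft_h

# e.g. stopping early leaves 3600 abstracts and 36 full texts unscreened
tiab_h, ft_h, total_h = hours_saved(tiab_skipped=3600, ft_skipped=36)
print(f"title/abstract: {tiab_h:.1f} h, full text: {ft_h:.1f} h, total: {total_h:.1f} h")
```

Under these assumptions the full-text component is small relative to title/abstract screening, mirroring the modest gap between the 29.8 h and 36 h medians reported above.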

Conclusions

The prioritization tool in DistillerSR can reduce screening burden. A modified or stop-screening approach once a true recall @ 95% is achieved appears to be a valid method for rapid reviews, and perhaps for systematic reviews. This needs to be further evaluated in prospective reviews using the estimated recall.
Metadata
Publisher
BioMed Central
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-020-01129-1
