Published in: Systematic Reviews 1/2019

Open Access 01-12-2019 | Commentary

A question of trust: can we build an evidence base to gain trust in systematic review automation technologies?

Authors: Annette M. O’Connor, Guy Tsafnat, James Thomas, Paul Glasziou, Stephen B. Gilbert, Brian Hutton



Abstract

Background

Although many aspects of systematic reviews use computational tools, systematic reviewers have been reluctant to adopt machine learning tools.

Discussion

We argue that the reasons for the slow adoption of machine learning tools into systematic reviews are multifactorial. We focus on the current absence of trust in automation and on set-up challenges as major barriers to adoption. It is important that reviews produced using automation tools be shown to be non-inferior or superior to current practice. However, this standard alone is unlikely to lead to widespread adoption: as with many technologies, reviewers need to see “others” in the review community using automation tools. Adoption will also be slow if the automation tools are not compatible with the workflows and tasks currently used to produce reviews. Many automation tools being developed for systematic reviews address classification problems. The evidence that these tools are non-inferior or superior can therefore be presented using methods similar to diagnostic test evaluations, i.e., precision and recall compared with a human reviewer. However, the assessment of automation tools does present unique challenges for investigators and systematic reviewers, including the need to clarify which metrics matter to the systematic review community and the distinctive documentation requirements of reproducible software experiments.
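To make the diagnostic-test framing concrete, the sketch below scores a hypothetical automated citation-screening tool against a human reviewer's include/exclude decisions, treating the human decisions as the reference standard. The function name, labels, and counts are illustrative assumptions, not data or code from the article.

```python
def precision_recall(human_labels, tool_labels):
    """Score an automation tool against human reviewer decisions.

    Treats the human reviewer's include (1) / exclude (0) decisions as
    the reference standard, analogous to a diagnostic test evaluation.
    """
    pairs = list(zip(human_labels, tool_labels))
    tp = sum(1 for h, t in pairs if h and t)        # both include
    fp = sum(1 for h, t in pairs if not h and t)    # tool over-includes
    fn = sum(1 for h, t in pairs if h and not t)    # tool misses a relevant citation
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative screening decisions for eight citations (1 = include).
human = [1, 1, 0, 0, 1, 0, 1, 0]
tool  = [1, 0, 0, 1, 1, 0, 1, 0]
p, r = precision_recall(human, tool)  # p = 0.75, r = 0.75
```

In screening, recall (sensitivity) is usually the metric of concern, since a missed relevant citation is costlier to a review than an extra citation passed to full-text screening.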

Conclusion

We discuss adoption barriers with the goal of giving tool developers guidance on how to design and report such evaluations, and of helping end users assess their validity. Further, we discuss approaches to formatting and announcing publicly available datasets suitable for assessing automation technologies and tools. Making these resources available will increase trust that tools are non-inferior or superior to current practice. Finally, we note that, even with evidence that automation tools are non-inferior or superior to current practice, substantial set-up challenges remain before automation is integrated into mainstream systematic review practice.
Metadata
Publication date
01-12-2019
Publisher
BioMed Central
Electronic ISSN: 2046-4053
DOI
https://doi.org/10.1186/s13643-019-1062-0
