
Open Access 01-12-2021 | Research

Text mining to support abstract screening for knowledge syntheses: a semi-automated workflow

Authors: Ba’ Pham, Jelena Jovanovic, Ebrahim Bagheri, Jesmin Antony, Huda Ashoor, Tam T. Nguyen, Patricia Rios, Reid Robson, Sonia M. Thomas, Jennifer Watt, Sharon E. Straus, Andrea C. Tricco

Published in: Systematic Reviews | Issue 1/2021


Abstract

Background

Current text mining tools supporting abstract screening in systematic reviews are not widely used, in part because they lack sensitivity and precision. We set out to develop an accessible, semi-automated “workflow” to conduct abstract screening for systematic reviews and other knowledge synthesis methods.

Methods

We adopted widely recommended text-mining and machine-learning methods to (1) process titles and abstracts into numerical training data and (2) train a classification model to predict eligible abstracts. The predicted abstracts are screened by human reviewers for (“true”) eligibility, and the newly eligible abstracts are used to identify similar abstracts with near-neighbor methods; these similar abstracts are also screened. The screened abstracts and their eligibility results are then used to update the classification model, and the above steps are iterated until no new eligible abstracts are identified. The workflow was implemented in R and evaluated using a systematic review of insulin formulations for type 1 diabetes (14,314 abstracts) and a scoping review of knowledge-synthesis methods (17,200 abstracts). Workflow performance was evaluated against the recommended practice of independent abstract screening by two reviewers. Standard measures were examined: sensitivity (inclusion of all truly eligible abstracts), specificity (exclusion of all truly ineligible abstracts), precision (proportion of truly eligible abstracts among all abstracts screened as eligible), F1-score (harmonic mean of sensitivity and precision), and accuracy (proportion of abstracts correctly predicted as eligible or ineligible). Workload reduction was measured as the hours of human screening the workflow saved, given that only a subset of abstracts required human screening.
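
To make the iterative loop concrete, the sketch below outlines one possible structure in R (the language the workflow was implemented in). It is an illustration only, not the authors' code: the feature matrix `X`, the `labels` vector, and the `human_screen()` function are hypothetical placeholders, a logistic regression stands in for the classification model, and cosine similarity over the feature matrix stands in for the near-neighbor step.

```r
## Illustrative sketch of the iterative screening loop (not the authors' implementation).
## Assumed inputs (hypothetical):
##   X            - numeric document-feature matrix, one row per title-abstract
##   labels       - 1/0 for abstracts already screened as eligible/ineligible, NA otherwise
##                  (must contain a few screened eligible and ineligible "seed" abstracts)
##   human_screen - function taking abstract indices and returning 1/0 eligibility decisions

cosine_sim <- function(A, B) {
  # cosine similarity between every row of A and every row of B
  A <- A / sqrt(rowSums(A^2))
  B <- B / sqrt(rowSums(B^2))
  A %*% t(B)
}

iterative_screen <- function(X, labels, human_screen, k = 25) {
  repeat {
    screened <- !is.na(labels)

    # (1) train a classification model on the abstracts screened so far
    train_df <- data.frame(y = labels[screened], as.data.frame(X[screened, , drop = FALSE]))
    fit <- glm(y ~ ., data = train_df, family = binomial())

    # (2) predict eligibility of the abstracts not yet screened
    new_df <- as.data.frame(X[!screened, , drop = FALSE])
    p <- predict(fit, newdata = new_df, type = "response")
    predicted <- which(!screened)[p > 0.5]

    # (3) near-neighbour step: unscreened abstracts most similar to known-eligible ones
    eligible <- which(screened & labels == 1)
    sims <- cosine_sim(X[!screened, , drop = FALSE], X[eligible, , drop = FALSE])
    top_k <- order(-apply(sims, 1, max))[seq_len(min(k, sum(!screened)))]
    neighbours <- which(!screened)[top_k]

    # (4) human reviewers screen the predicted abstracts and their near neighbours
    to_screen <- union(predicted, neighbours)
    if (length(to_screen) == 0) break
    decisions <- human_screen(to_screen)
    labels[to_screen] <- decisions

    # iterate until a screening round identifies no newly eligible abstracts
    if (!any(decisions == 1)) break
  }
  labels
}
```

The stopping rule mirrors the description above: the loop ends once a round of human screening yields no newly eligible abstracts, so only a subset of the full abstract set ever requires human review.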

Results

For the systematic review and the scoping review, respectively, the workflow attained 88%/89% sensitivity, 99%/99% specificity, 71%/72% precision, 79%/79% F1-score, 98%/97% accuracy, and 63%/55% workload reduction, with 12%/11% fewer abstracts requiring full-text retrieval and screening, and 0%/1.5% of the studies included in the completed reviews missed.
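
As a quick arithmetic check (not an additional result), the reported F1-scores follow from the reported sensitivity and precision via the harmonic-mean definition given in the Methods; small differences are due to rounding of the reported percentages:

```r
# F1 as the harmonic mean of sensitivity and precision
f1 <- function(sens, prec) 2 * sens * prec / (sens + prec)
f1(0.88, 0.71)  # ~0.79 for the systematic review
f1(0.89, 0.72)  # ~0.80 from the rounded inputs; reported as 79% for the scoping review
```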

Conclusion

The workflow was a sensitive, precise, and efficient alternative to the recommended practice of screening abstracts with two reviewers. All eligible studies were identified in the first case, while 6 studies (1.5%) were missed in the second case; these would likely not impact the review’s conclusions. We have described the workflow in language accessible to reviewers with limited exposure to natural language processing and machine learning, and have made the code available to reviewers.
Metadata
Publication date
01-12-2021
Publisher
BioMed Central
Published in
Systematic Reviews / Issue 1/2021
Electronic ISSN: 2046-4053
DOI
https://doi.org/10.1186/s13643-021-01700-x
