Published in: Systematic Reviews 1/2021

Open Access 01-12-2021 | Methodology

Iterative guided machine learning-assisted systematic literature reviews: a diabetes case study

Authors: John Zimmerman, Robin E. Soler, James Lavinder, Sarah Murphy, Charisma Atkins, LaShonda Hulbert, Richard Lusk, Boon Peng Ng


Abstract

Background

Systematic reviews (SRs), or studies of studies, use a formal process to evaluate the quality of scientific literature and to determine effectiveness from qualifying articles in order to establish consensus findings around a hypothesis. Their value is increasing as the volume of published research and evaluation has expanded and the process of identifying key insights has become more time-consuming. Text analytics and machine learning (ML) techniques may help overcome this problem of scale while maintaining the level of rigor expected of SRs.

Methods

In this article, we discuss an approach that uses existing examples of SRs to build and test a method for assisting SR title and abstract pre-screening by reducing the initial pool of potential articles to those that meet inclusion criteria. Our approach differs from previous uses of ML as an SR tool in that it incorporates ML configurations guided by previously conducted SRs, together with human confirmation of ML predictions of relevant articles during multiple iterative reviews of smaller tranches of citations. We applied the tailored method to a new SR effort to validate performance.
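The iterative loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the token-count scorer stands in for whatever ML configuration a prior SR would suggest, and the names `train`, `score`, `iterative_screen`, `human_label`, and the "stop after an empty tranche" rule are all assumptions made for illustration.

```python
from collections import Counter

def tokenize(text):
    # Lowercase, whitespace-split tokens; a deliberately simple stand-in.
    return [w for w in text.lower().split() if w.isalpha()]

def train(labeled):
    # Count, per token, how many relevant and irrelevant texts contain it.
    pos, neg = Counter(), Counter()
    for text, relevant in labeled:
        (pos if relevant else neg).update(set(tokenize(text)))
    return pos, neg

def score(model, text):
    # Higher score = more tokens shared with previously relevant texts.
    pos, neg = model
    return sum(pos[t] - neg[t] for t in set(tokenize(text)))

def iterative_screen(seed, pool, human_label, tranche_size, stop_after_empty=1):
    """Rank the pool, have a human confirm the top tranche, retrain, repeat.

    seed: (text, is_relevant) pairs, e.g. from a previously conducted SR.
    human_label: callable standing in for the human reviewer (hypothetical).
    Stops after `stop_after_empty` consecutive tranches with no relevant
    article (an assumed stopping rule, not one taken from the article).
    """
    labeled, remaining = list(seed), list(pool)
    found, reviewed, empty_rounds = [], 0, 0
    while remaining and empty_rounds < stop_after_empty:
        model = train(labeled)  # retrain on all confirmed labels so far
        remaining.sort(key=lambda t: score(model, t), reverse=True)
        tranche, remaining = remaining[:tranche_size], remaining[tranche_size:]
        hits = 0
        for text in tranche:
            relevant = human_label(text)  # human confirms the prediction
            labeled.append((text, relevant))
            reviewed += 1
            if relevant:
                found.append(text)
                hits += 1
        empty_rounds = 0 if hits else empty_rounds + 1
    return found, reviewed

# Toy data, invented purely to exercise the loop.
seed = [("diabetes prevention program outcomes", True),
        ("soil erosion in river basins", False)]
pool = ["lifestyle program for diabetes prevention",
        "glucose outcomes after diabetes intervention",
        "river sediment transport model",
        "erosion control in mountain soil",
        "galaxy formation in early universe",
        "insulin and glucose monitoring",
        "stellar spectra of distant galaxy",
        "bridge load testing methods"]
reviewer = lambda t: any(k in t for k in ("diabetes", "glucose", "insulin"))

found, reviewed = iterative_screen(seed, pool, reviewer, tranche_size=2)
print(len(found), "relevant found after reviewing", reviewed, "of", len(pool))
```

In this toy run the human reviewer never sees the lowest-ranked citations, which is the workload reduction the method aims for. Each round retrains on the confirmed labels, so later rankings reflect what the reviewer has already judged.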

Results

The case study test of the approach demonstrated a sensitivity (recall) in finding relevant articles during down-selection that may rival many traditional processes and shows an ability to overcome most type II errors (false negatives). The study achieved a sensitivity of 99.5% (213 of 214 relevant articles) while requiring human review of only 31% of the articles available for review.
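As a quick check on the reported figures, the arithmetic can be sketched as follows. The function names are illustrative. The total number of articles in the pool is not stated in the abstract, so the workload figure is carried as a fraction rather than a count.

```python
def sensitivity(found_relevant, total_relevant):
    # Recall: share of all relevant articles that the process retrieved.
    return found_relevant / total_relevant

def work_saved(fraction_reviewed):
    # Share of the pool that never needed human review.
    return 1.0 - fraction_reviewed

recall = sensitivity(213, 214)   # counts reported in the Results
missed = 214 - 213               # false negatives (type II errors)
saved = work_saved(0.31)         # 31% of articles were human-reviewed

print(f"sensitivity = {recall:.1%}, missed = {missed}, work saved = {saved:.0%}")
```

With these inputs, one relevant article is missed and roughly 69% of the pool is never sent to a human reviewer.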

Conclusions

We believe this iterative method can help overcome bias in initial ML model training by having humans reinforce ML models with new and relevant information, and that it is an applied step toward transfer learning for ML in SRs.
Metadata
Title
Iterative guided machine learning-assisted systematic literature reviews: a diabetes case study
Authors
John Zimmerman
Robin E. Soler
James Lavinder
Sarah Murphy
Charisma Atkins
LaShonda Hulbert
Richard Lusk
Boon Peng Ng
Publication date
01-12-2021
Publisher
BioMed Central
Published in
Systematic Reviews / Issue 1/2021
Electronic ISSN: 2046-4053
DOI
https://doi.org/10.1186/s13643-021-01640-6
