Published in: Systematic Reviews 1/2015

Open Access 01-12-2015 | Research

Using text mining for study identification in systematic reviews: a systematic review of current approaches

Authors: Alison O’Mara-Eves, James Thomas, John McNaught, Makoto Miwa, Sophia Ananiadou


Abstract

Background

The large and growing number of published studies, and their increasing rate of publication, makes the task of identifying relevant studies in an unbiased way for inclusion in systematic reviews both complex and time-consuming. Text mining has been offered as a potential solution: through automating some of the screening process, reviewer time can be saved. The evidence base around the use of text mining for screening has not yet been pulled together systematically; this systematic review fills that research gap. Focusing mainly on non-technical issues, the review aims to increase awareness of the potential of these technologies and promote further collaborative research between the computer science and systematic review communities.

Methods

Five research questions led our review: what is the state of the evidence base; how has workload reduction been evaluated; what are the purposes of semi-automation and how effective are they; how have key contextual problems of applying text mining to the systematic review field been addressed; and what challenges to implementation have emerged?
We answered these questions using standard systematic review methods: systematic and exhaustive searching, quality-assured data extraction and a narrative synthesis of the findings.

Results

The evidence base is active and diverse; there is almost no replication between studies or collaboration between research teams and, whilst it is difficult to establish any overall conclusions about best approaches, it is clear that efficiencies and reductions in workload are potentially achievable.
On the whole, most studies suggested that a workload saving of between 30% and 70% might be possible, though the saving is sometimes accompanied by the loss of 5% of relevant studies (i.e. 95% recall).
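
As a rough illustration of the arithmetic behind these two figures, the sketch below computes recall and workload saving for a single screening run. The citation counts and the function names are hypothetical and chosen only for illustration; they are not taken from any of the included studies.

```python
def recall(relevant_found: int, relevant_total: int) -> float:
    """Proportion of all relevant studies that were identified."""
    if relevant_total <= 0:
        raise ValueError("relevant_total must be positive")
    return relevant_found / relevant_total


def workload_saving(screened_manually: int, total_citations: int) -> float:
    """Proportion of citations that did not need manual screening."""
    if total_citations <= 0:
        raise ValueError("total_citations must be positive")
    return 1 - screened_manually / total_citations


# Hypothetical review: 10,000 citations, of which 200 are truly relevant.
total_citations = 10_000
relevant_total = 200

# Assume manual screening stops after 5,000 citations,
# by which point 190 of the relevant studies have been found.
screened_manually = 5_000
relevant_found = 190

r = recall(relevant_found, relevant_total)
s = workload_saving(screened_manually, total_citations)
print(f"recall = {r:.0%}, workload saving = {s:.0%}")
```

In this made-up case, half of the manual screening is avoided at the cost of missing 10 of the 200 relevant studies, which is the kind of trade-off the figures above describe.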

Conclusions

Using text mining to prioritise the order in which items are screened should be considered safe and ready for use in ‘live’ reviews. Text mining may also be used cautiously as a ‘second screener’. Using text mining to eliminate studies automatically should be considered promising, but not yet fully proven: in highly technical/clinical areas, it may be used with a high degree of confidence, but more developmental and evaluative work is needed in other disciplines.
Literature
1.
go back to reference Gough D, Elbourne D: Systematic research synthesis to inform policy, practice and democratic debate.Soc Policy Soc 2002, 1:225–36.CrossRef Gough D, Elbourne D: Systematic research synthesis to inform policy, practice and democratic debate.Soc Policy Soc 2002, 1:225–36.CrossRef
2.
go back to reference Gough D, Oliver S, Thomas J: An Introduction to Systematic Reviews. London: Sage; 2012. Gough D, Oliver S, Thomas J: An Introduction to Systematic Reviews. London: Sage; 2012.
3.
go back to reference Gough D, Thomas J, Oliver S: Clarifying differences between review designs and methods.Syst Rev 2012.,1(28): doi:10.1186/2046–4053–1-28 Gough D, Thomas J, Oliver S: Clarifying differences between review designs and methods.Syst Rev 2012.,1(28): doi:10.1186/2046–4053–1-28
4.
go back to reference Chalmers I, Hedges L, Cooper H: A brief history of research synthesis.Eval Health Prof 2002, 25:12–37. 10.1177/0163278702025001003CrossRefPubMed Chalmers I, Hedges L, Cooper H: A brief history of research synthesis.Eval Health Prof 2002, 25:12–37. 10.1177/0163278702025001003CrossRefPubMed
6.
go back to reference Bastian H, Glasziou P, Chalmers I: Seventy-five trials and eleven systematic reviews a day: how will we ever keep up?PLoS Med 2010.,7(9): Bastian H, Glasziou P, Chalmers I: Seventy-five trials and eleven systematic reviews a day: how will we ever keep up?PLoS Med 2010.,7(9):
7.
go back to reference Lefebvre C, Manheimer E, Glanville J: Searching for studies (chapter 6). In Cochrane Handbook for Systematic Reviews of Interventions Version 510 [updated March 2011]. Edited by: Higgins J, Green S. Oxford: The Cochrane Collaboration; 2011. Lefebvre C, Manheimer E, Glanville J: Searching for studies (chapter 6). In Cochrane Handbook for Systematic Reviews of Interventions Version 510 [updated March 2011]. Edited by: Higgins J, Green S. Oxford: The Cochrane Collaboration; 2011.
8.
go back to reference Gomersall A, Cooper C: Database selection bias and its affect on systematic reviews: a United Kingdom perspective. In Joint Colloquium of the Cochrane and Campbell Collaborations. Keystone, Colorado: The Campbell Collaboration; 2010. Gomersall A, Cooper C: Database selection bias and its affect on systematic reviews: a United Kingdom perspective. In Joint Colloquium of the Cochrane and Campbell Collaborations. Keystone, Colorado: The Campbell Collaboration; 2010.
9.
go back to reference Harden A, Peersman G, Oliver S, Oakley A: Identifying primary research on electronic databases to inform decision-making in health promotion: the case of sexual health promotion.Health Educ J 1999, 58:290–301. 10.1177/001789699905800310CrossRef Harden A, Peersman G, Oliver S, Oakley A: Identifying primary research on electronic databases to inform decision-making in health promotion: the case of sexual health promotion.Health Educ J 1999, 58:290–301. 10.1177/001789699905800310CrossRef
10.
go back to reference Sampson M, Barrowman N, Moher D, Clifford T, Platt R, Morrison A, et al.: Can electronic search engines optimize screening of search results in systematic reviews: an empirical study.BMC Med Res Methodol 2006.,6(7): Sampson M, Barrowman N, Moher D, Clifford T, Platt R, Morrison A, et al.: Can electronic search engines optimize screening of search results in systematic reviews: an empirical study.BMC Med Res Methodol 2006.,6(7):
11.
go back to reference Wallace B, Trikalinos T, Lau J, Brodley C, Schmid C: Semi-automated screening of biomedical citations for systematic reviews.BMC Bioinformatics 2010.,11(55): Wallace B, Trikalinos T, Lau J, Brodley C, Schmid C: Semi-automated screening of biomedical citations for systematic reviews.BMC Bioinformatics 2010.,11(55):
12.
go back to reference Allen I, Olkin I: Estimating time to conduct a meta-analysis from number of citations retrieved.JAMA 1999,282(7):634–5. 10.1001/jama.282.7.634CrossRefPubMed Allen I, Olkin I: Estimating time to conduct a meta-analysis from number of citations retrieved.JAMA 1999,282(7):634–5. 10.1001/jama.282.7.634CrossRefPubMed
13.
go back to reference Felizardo K, Andery G, Paulovich F, Minghim R, Maldonado J: A visual analysis approach to validate the selection review of primary studies in systematic reviews.Inf Softw Technol 2012,54(10):1079–91. 10.1016/j.infsof.2012.04.003CrossRef Felizardo K, Andery G, Paulovich F, Minghim R, Maldonado J: A visual analysis approach to validate the selection review of primary studies in systematic reviews.Inf Softw Technol 2012,54(10):1079–91. 10.1016/j.infsof.2012.04.003CrossRef
14.
go back to reference Malheiros V, Hohn E, Pinho R, Mendonca M: A visual text mining approach for systematic reviews. In Empirical Software Engineering and Measurement, 2007 ESEM 2007 First International Symposium on: 2007 2007. Piscataway: IEEE; 2007:245–54. Malheiros V, Hohn E, Pinho R, Mendonca M: A visual text mining approach for systematic reviews. In Empirical Software Engineering and Measurement, 2007 ESEM 2007 First International Symposium on: 2007 2007. Piscataway: IEEE; 2007:245–54.
15.
go back to reference Miroslav K, Matwin S: Addressing the curse of imbalanced training sets: one-sided selection.Proceedings of the Fourteenth International Conference on Machine Learning: 1997 1997. Miroslav K, Matwin S: Addressing the curse of imbalanced training sets: one-sided selection.Proceedings of the Fourteenth International Conference on Machine Learning: 1997 1997.
16.
go back to reference Watt A, Cameron A, Sturm L, Lathlean T, Babidge W, Blamey S, et al.: Rapid reviews versus full systematic reviews: an inventory of current methods and practice in health technology assessment.Int J Technol Assess Health Care 2008,24(2):133–9.CrossRefPubMed Watt A, Cameron A, Sturm L, Lathlean T, Babidge W, Blamey S, et al.: Rapid reviews versus full systematic reviews: an inventory of current methods and practice in health technology assessment.Int J Technol Assess Health Care 2008,24(2):133–9.CrossRefPubMed
17.
go back to reference Ananiadou S, McNaught J: Text Mining for Biology and Biomedicine. Boston/London: Artech House; 2006. Ananiadou S, McNaught J: Text Mining for Biology and Biomedicine. Boston/London: Artech House; 2006.
18.
go back to reference Hearst M: Untangling Text Data Mining.Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL 1999): 1999 1999, 3–10. Hearst M: Untangling Text Data Mining.Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL 1999): 1999 1999, 3–10.
19.
go back to reference Thomas J, McNaught J, Ananiadou S: Applications of text mining within systematic reviews.Res Synth Methods 2011,2(1):1–14. 10.1002/jrsm.27CrossRefPubMed Thomas J, McNaught J, Ananiadou S: Applications of text mining within systematic reviews.Res Synth Methods 2011,2(1):1–14. 10.1002/jrsm.27CrossRefPubMed
20.
go back to reference Ananiadou S, Okazaki N, Procter R, Rea B, Sasaki Y, Thomas J: Supporting systematic reviews using text mining.Soc Sci Comput Rev 2009, 27:509–23. 10.1177/0894439309332293CrossRef Ananiadou S, Okazaki N, Procter R, Rea B, Sasaki Y, Thomas J: Supporting systematic reviews using text mining.Soc Sci Comput Rev 2009, 27:509–23. 10.1177/0894439309332293CrossRef
21.
go back to reference Thomas J: Diffusion of innovation in systematic review methodology: why is study selection not yet assisted by automation?OA Evid Based Med 2013,1(2):12.CrossRef Thomas J: Diffusion of innovation in systematic review methodology: why is study selection not yet assisted by automation?OA Evid Based Med 2013,1(2):12.CrossRef
22.
go back to reference Thomas J, Brunton J, Graziosi S: EPPI-Reviewer 4.0: Software for Research Synthesis. London: EPPI-Centre Software, Social Science Research Unit, Institute of Education; 2010. Thomas J, Brunton J, Graziosi S: EPPI-Reviewer 4.0: Software for Research Synthesis. London: EPPI-Centre Software, Social Science Research Unit, Institute of Education; 2010.
23.
24.
go back to reference Frunza O, Inkpen D, Matwin S: Building systematic reviews using automatic text classification techniques. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters: 2010 2010. Beijing China: Association for Computational Linguistics; 2010:303–11. Frunza O, Inkpen D, Matwin S: Building systematic reviews using automatic text classification techniques. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters: 2010 2010. Beijing China: Association for Computational Linguistics; 2010:303–11.
25.
go back to reference Wallace B, Small K, Brodley C, Trikalinos T: Active learning for biomedical citation screening.KDD 2010; Washington USA 2010. Wallace B, Small K, Brodley C, Trikalinos T: Active learning for biomedical citation screening.KDD 2010; Washington USA 2010.
26.
go back to reference Lavoie M, Verbeek J: Devices for preventing percutaneous exposure injuries caused by needles in healthcare personnel.Cochrane Database Syst Rev 2014.,2014(3): Lavoie M, Verbeek J: Devices for preventing percutaneous exposure injuries caused by needles in healthcare personnel.Cochrane Database Syst Rev 2014.,2014(3):
27.
go back to reference Mischke C, Verbeek J, Saarto A, Lavoie MC, Pahwa M, Ijaz S: Gloves, extra gloves or special types of gloves for preventing percutaneous exposure injuries in healthcare personnel.Cochrane Database Syst Rev 2014.,2014(3): Mischke C, Verbeek J, Saarto A, Lavoie MC, Pahwa M, Ijaz S: Gloves, extra gloves or special types of gloves for preventing percutaneous exposure injuries in healthcare personnel.Cochrane Database Syst Rev 2014.,2014(3):
28.
go back to reference Martin A, Saunders D, Shenkin S, Sproule J: Lifestyle intervention for improving school achievement in overweight or obese children and adolescents.Cochrane Database Syst Rev 2014.,2014(3): Martin A, Saunders D, Shenkin S, Sproule J: Lifestyle intervention for improving school achievement in overweight or obese children and adolescents.Cochrane Database Syst Rev 2014.,2014(3):
29.
go back to reference Fletcher-Watson S, McConnell F, Manola E, McConachie H: Interventions based on the Theory of Mind cognitive model for autism spectrum disorder (ASD).Cochrane Database Syst Rev 2014.,2014(3): Fletcher-Watson S, McConnell F, Manola E, McConachie H: Interventions based on the Theory of Mind cognitive model for autism spectrum disorder (ASD).Cochrane Database Syst Rev 2014.,2014(3):
30.
go back to reference Bekhuis T, Demner-Fushman D: Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers.Artif Intell Med 2012,55(3):197–207. 10.1016/j.artmed.2012.05.002CrossRefPubMedPubMedCentral Bekhuis T, Demner-Fushman D: Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers.Artif Intell Med 2012,55(3):197–207. 10.1016/j.artmed.2012.05.002CrossRefPubMedPubMedCentral
31.
go back to reference Shemilt I, Simon A, Hollands G, Marteau T, Ogilvie D, O’Mara-Eves A, et al.: Pinpointing needles in giant haystacks: use of text mining to reduce impractical screening workload in extremely large scoping reviews.Res Synth Methods 2013, 13:1218. n/a-n/a Shemilt I, Simon A, Hollands G, Marteau T, Ogilvie D, O’Mara-Eves A, et al.: Pinpointing needles in giant haystacks: use of text mining to reduce impractical screening workload in extremely large scoping reviews.Res Synth Methods 2013, 13:1218. n/a-n/a
32.
go back to reference Hammerstrøm K, Wade A, Jørgensen A: Searching for Studies: A Guide to Information Retrieval for Campbell Systematic Reviews. Keystone, Colorado: Campbell Collaboration; 2010. Hammerstrøm K, Wade A, Jørgensen A: Searching for Studies: A Guide to Information Retrieval for Campbell Systematic Reviews. Keystone, Colorado: Campbell Collaboration; 2010.
33.
go back to reference Institute of Medicine of the National Academies: Finding what works in health care: standards for systematic reviews. Washington, DC: Institute of Medicine of the National Academies; 2011. Institute of Medicine of the National Academies: Finding what works in health care: standards for systematic reviews. Washington, DC: Institute of Medicine of the National Academies; 2011.
34.
go back to reference Cohen A: Performance of support-vector-machine-based classification on 15 systematic review topics evaluated with the WSS@95 measure.J Am Med Inform Assoc 2011, 18:104-–4.CrossRefPubMed Cohen A: Performance of support-vector-machine-based classification on 15 systematic review topics evaluated with the WSS@95 measure.J Am Med Inform Assoc 2011, 18:104-–4.CrossRefPubMed
35.
go back to reference Cohen A, Ambert K, McDonagh M: A prospective evaluation of an automated classification system to support evidence-based medicine and systematic review.AMIA Annual Symposium 2010, 121–5. Cohen A, Ambert K, McDonagh M: A prospective evaluation of an automated classification system to support evidence-based medicine and systematic review.AMIA Annual Symposium 2010, 121–5.
36.
go back to reference Cohen A, Hersh W, Peterson K, Yen P-Y: Reducing workload in systematic review preparation using automated citation classification.J Am Med Inform Assoc 2006,13(2):206–19. 10.1197/jamia.M1929CrossRefPubMedPubMedCentral Cohen A, Hersh W, Peterson K, Yen P-Y: Reducing workload in systematic review preparation using automated citation classification.J Am Med Inform Assoc 2006,13(2):206–19. 10.1197/jamia.M1929CrossRefPubMedPubMedCentral
37.
go back to reference Cohen A: An effective general purpose approach for automated biomedical document classification. In AMIA Annual Symposium Proceedings, vol. 13. Washington, DC: American Medical Informatics Association; 2006:206–19. Cohen A: An effective general purpose approach for automated biomedical document classification. In AMIA Annual Symposium Proceedings, vol. 13. Washington, DC: American Medical Informatics Association; 2006:206–19.
38.
go back to reference Fiszman M, Ortiz E, Bray BE, Rindflesch TC: Semantic Processing to Support Clinical Guideline Development.AMIA 2008 Symposium Proceedings: 2008 2008 2008, 187–91. Fiszman M, Ortiz E, Bray BE, Rindflesch TC: Semantic Processing to Support Clinical Guideline Development.AMIA 2008 Symposium Proceedings: 2008 2008 2008, 187–91.
39.
go back to reference Kim S, Choi J: Improving the performance of text categorization models used for the selection of high quality articles.Healthc Informatics Res 2012,18(1):18–28. 10.4258/hir.2012.18.1.18CrossRef Kim S, Choi J: Improving the performance of text categorization models used for the selection of high quality articles.Healthc Informatics Res 2012,18(1):18–28. 10.4258/hir.2012.18.1.18CrossRef
40.
go back to reference Ma Y: Text Classification on Imbalanced Data: Application to Systematic Reviews Automation. Ottawa: University of Ottawa; 2007. Ma Y: Text Classification on Imbalanced Data: Application to Systematic Reviews Automation. Ottawa: University of Ottawa; 2007.
41.
go back to reference Matwin S, Kouznetsov A, Inkpen D, Frunza O, O’Blenis P: A new algorithm for reducing the workload of experts in performing systematic reviews.J Am Med Inform Assoc 2010,17(4):446–53. 10.1136/jamia.2010.004325CrossRefPubMedPubMedCentral Matwin S, Kouznetsov A, Inkpen D, Frunza O, O’Blenis P: A new algorithm for reducing the workload of experts in performing systematic reviews.J Am Med Inform Assoc 2010,17(4):446–53. 10.1136/jamia.2010.004325CrossRefPubMedPubMedCentral
42.
go back to reference Matwin S, Kouznetsov A, Inkpen D, Frunza O, O’Blenis P: Performance of SVM and Bayesian classifiers on the systematic review classification task.J Am Med Inform Assoc 2011, 18:104–5.CrossRef Matwin S, Kouznetsov A, Inkpen D, Frunza O, O’Blenis P: Performance of SVM and Bayesian classifiers on the systematic review classification task.J Am Med Inform Assoc 2011, 18:104–5.CrossRef
44.
go back to reference Razavi A, Matwin S, Inkpen D, Kouznetsov A: Parameterized Contrast in Second Order Soft Co-Occurrences: A Novel Text Representation Technique in Text Mining and Knowledge Extraction. In 2009 Ieee International Conference on Data Mining Workshops: 2009 2009. New York: Ieee; 2009:471–6.CrossRef Razavi A, Matwin S, Inkpen D, Kouznetsov A: Parameterized Contrast in Second Order Soft Co-Occurrences: A Novel Text Representation Technique in Text Mining and Knowledge Extraction. In 2009 Ieee International Conference on Data Mining Workshops: 2009 2009. New York: Ieee; 2009:471–6.CrossRef
45.
go back to reference Miwa M, Thomas J, O’Mara-Eves A, Ananiadou S: Reducing systematic review workload through certainty-based screening.J Biomed Inform 2014, 51:242–53. doi:10.1016/j.jbi.2014.06.005CrossRefPubMedPubMedCentral Miwa M, Thomas J, O’Mara-Eves A, Ananiadou S: Reducing systematic review workload through certainty-based screening.J Biomed Inform 2014, 51:242–53. doi:10.1016/j.jbi.2014.06.005CrossRefPubMedPubMedCentral
46.
go back to reference Sun Y, Yang Y, Zhang H, Zhang W, Wang Q: Towards evidence-based ontology for supporting Systematic Literature Review. In Proceedings of the EASE Conference 2012: 2012 2012. Ciudad Real Spain: IET; 2012. Sun Y, Yang Y, Zhang H, Zhang W, Wang Q: Towards evidence-based ontology for supporting Systematic Literature Review. In Proceedings of the EASE Conference 2012: 2012 2012. Ciudad Real Spain: IET; 2012.
47.
go back to reference Tomassetti F, Rizzo G, Vetro A, Ardito L, Torchiano M, Morisio M: Linked data approach for selection process automation in systematic reviews.Evaluation & Assessment in Software Engineering (EASE 2011), 15th Annual Conference on: 2011 2011; Durham 2011, 31–5.CrossRef Tomassetti F, Rizzo G, Vetro A, Ardito L, Torchiano M, Morisio M: Linked data approach for selection process automation in systematic reviews.Evaluation & Assessment in Software Engineering (EASE 2011), 15th Annual Conference on: 2011 2011; Durham 2011, 31–5.CrossRef
48.
go back to reference Wallace B, Small K, Brodley C, Lau J, Schmid C, Bertram L, et al.: Toward modernizing the systematic review pipeline in genetics: efficient updating via data mining.Genet Med 2012, 14:663–9. 10.1038/gim.2012.7CrossRefPubMedPubMedCentral Wallace B, Small K, Brodley C, Lau J, Schmid C, Bertram L, et al.: Toward modernizing the systematic review pipeline in genetics: efficient updating via data mining.Genet Med 2012, 14:663–9. 10.1038/gim.2012.7CrossRefPubMedPubMedCentral
49.
go back to reference Wallace B, Small K, Brodley C, Lau J, Trikalinos T: Modeling Annotation Time to Reduce Workload in Comparative Effectiveness Reviews.Proc ACM International Health Informatics Symposium: 2010 2010 2010, 28–35. Wallace B, Small K, Brodley C, Lau J, Trikalinos T: Modeling Annotation Time to Reduce Workload in Comparative Effectiveness Reviews.Proc ACM International Health Informatics Symposium: 2010 2010 2010, 28–35.
50.
go back to reference Wallace B, Small K, Brodley C, Lau J, Trikalinos T: Deploying an interactive machine learning system in an evidence-based practice center: abstrackr. In Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium: 2012. New York: ACM; 2012:819–24.CrossRef Wallace B, Small K, Brodley C, Lau J, Trikalinos T: Deploying an interactive machine learning system in an evidence-based practice center: abstrackr. In Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium: 2012. New York: ACM; 2012:819–24.CrossRef
51.
go back to reference Yu W, Clyne M, Dolan S, Yesupriya A, Wulf A, Liu T, et al.: GAPscreener: an automatic tool for screening human genetic association literature in PubMed using the support vector machine technique.BMC Bioinformatics 2008.,205(9): Yu W, Clyne M, Dolan S, Yesupriya A, Wulf A, Liu T, et al.: GAPscreener: an automatic tool for screening human genetic association literature in PubMed using the support vector machine technique.BMC Bioinformatics 2008.,205(9):
52.
go back to reference Choi S, Ryu B, Yoo S, Choi J: Combining relevancy and methodological quality into a single ranking for evidence-based medicine.Inf Sci 2012, 214:76–90.CrossRef Choi S, Ryu B, Yoo S, Choi J: Combining relevancy and methodological quality into a single ranking for evidence-based medicine.Inf Sci 2012, 214:76–90.CrossRef
53.
go back to reference Fiszman M, Bray BE, Shina D, Kilicoglu H, Bennett GC, Bodenreider O, et al.: Combining relevance assignment with quality of the evidence to support guideline development.Stud Health Technol Inform 2010,160(1):709–13.PubMedPubMedCentral Fiszman M, Bray BE, Shina D, Kilicoglu H, Bennett GC, Bodenreider O, et al.: Combining relevance assignment with quality of the evidence to support guideline development.Stud Health Technol Inform 2010,160(1):709–13.PubMedPubMedCentral
54.
go back to reference Kouznetsov A, Japkowicz N: Using classifier performance visualization to improve collective ranking techniques for biomedical abstracts classification. In Advances in Artificial Intelligence, Proceedings: 2010. Berlin: Springer-Verlag Berlin; 2010:299–303.CrossRef Kouznetsov A, Japkowicz N: Using classifier performance visualization to improve collective ranking techniques for biomedical abstracts classification. In Advances in Artificial Intelligence, Proceedings: 2010. Berlin: Springer-Verlag Berlin; 2010:299–303.CrossRef
55.
go back to reference Kouznetsov A, Matwin S, Inkpen D, Razavi A, Frunza O, Sehatkar M, et al.: Classifying biomedical abstracts using committees of classifiers and collective ranking techniques. In Advances in Artificial Intelligence, Proceedings: 2009. Berlin: Springer-Verlag Berlin; 2009:224–8.CrossRef Kouznetsov A, Matwin S, Inkpen D, Razavi A, Frunza O, Sehatkar M, et al.: Classifying biomedical abstracts using committees of classifiers and collective ranking techniques. In Advances in Artificial Intelligence, Proceedings: 2009. Berlin: Springer-Verlag Berlin; 2009:224–8.CrossRef
56.
go back to reference Martinez D, Karimi S, Cavedon L, Baldwin T: Facilitating biomedical systematic reviews using ranked text retrieval and classification.Proceedings of the 13th Australasian Document Computing Symposium: 2008; Hobart Australia 2008, 53. Martinez D, Karimi S, Cavedon L, Baldwin T: Facilitating biomedical systematic reviews using ranked text retrieval and classification.Proceedings of the 13th Australasian Document Computing Symposium: 2008; Hobart Australia 2008, 53.
57.
go back to reference Thomas J, O’Mara A: How can we find relevant research more quickly? In NCRM MethodsNews. UK: NCRM; 2011:3. Thomas J, O’Mara A: How can we find relevant research more quickly? In NCRM MethodsNews. UK: NCRM; 2011:3.
58.
go back to reference Wallace B, Small K, Brodley C, Trikalinos T: Who should label what? Instance allocation in multiple expert active learning.Proc SIAM International Conference on Data Mining: 2011 2011, 176–87.CrossRef Wallace B, Small K, Brodley C, Trikalinos T: Who should label what? Instance allocation in multiple expert active learning.Proc SIAM International Conference on Data Mining: 2011 2011, 176–87.CrossRef
59.
go back to reference Bekhuis T, Demner-Fushman D: Towards automating the initial screening phase of a systematic review.Stud Health Technol Inform 2010,160(1):146–50.PubMed Bekhuis T, Demner-Fushman D: Towards automating the initial screening phase of a systematic review.Stud Health Technol Inform 2010,160(1):146–50.PubMed
60.
go back to reference Bekhuis T, Tseytlin E, Mitchell K, Demner-Fushman D: Feature engineering and a proposed decision-support system for systematic reviewers of medical evidence.PLoS One 2014,9(1):e86277. 10.1371/journal.pone.0086277CrossRefPubMedPubMedCentral Bekhuis T, Tseytlin E, Mitchell K, Demner-Fushman D: Feature engineering and a proposed decision-support system for systematic reviewers of medical evidence.PLoS One 2014,9(1):e86277. 10.1371/journal.pone.0086277CrossRefPubMedPubMedCentral
61.
go back to reference Frunza O, Inkpen D, Matwin S, Klement W, O’Blenis P: Exploiting the systematic review protocol for classification of medical abstracts.Artif Intell Med 2011,51(1):17–25. 10.1016/j.artmed.2010.10.005CrossRefPubMed Frunza O, Inkpen D, Matwin S, Klement W, O’Blenis P: Exploiting the systematic review protocol for classification of medical abstracts.Artif Intell Med 2011,51(1):17–25. 10.1016/j.artmed.2010.10.005CrossRefPubMed
62.
go back to reference García Adevaa J, Pikatza-Atxa J, Ubeda-Carrillo M, Ansuategi-Zengotitabengoa E: Automatic text classification to support systematic reviews in medicine.Expert Syst Appl 2014,41(4):1498–508. 10.1016/j.eswa.2013.08.047CrossRef García Adevaa J, Pikatza-Atxa J, Ubeda-Carrillo M, Ansuategi-Zengotitabengoa E: Automatic text classification to support systematic reviews in medicine.Expert Syst Appl 2014,41(4):1498–508. 10.1016/j.eswa.2013.08.047CrossRef
64.
go back to reference Felizardo K, Salleh N, Martins R, Mendes E, MacDonell S, Maldonado J: Using visual text mining to support the study selection activity in systematic literature reviews.Empirical Software Engineering and Measurement (ESEM), 2011 International Symposium on: 2011; Banff 2011, 77–86.CrossRef Felizardo K, Salleh N, Martins R, Mendes E, MacDonell S, Maldonado J: Using visual text mining to support the study selection activity in systematic literature reviews.Empirical Software Engineering and Measurement (ESEM), 2011 International Symposium on: 2011; Banff 2011, 77–86.CrossRef
65.
go back to reference Felizardo R, Souza S, Maldonado J: The use of visual text mining to support the study selection activity in systematic literature reviews: a replication study.Replication in Empirical Software Engineering Research (RESER), 2013 3rd International Workshop on: 2013; Baltimore 2013, 91–100.CrossRef Felizardo R, Souza S, Maldonado J: The use of visual text mining to support the study selection activity in systematic literature reviews: a replication study.Replication in Empirical Software Engineering Research (RESER), 2013 3rd International Workshop on: 2013; Baltimore 2013, 91–100.CrossRef
66.
go back to reference Cohen A, Ambert K, McDonagh M: Cross-topic learning for work prioritization in systematic review creation and update.J Am Med Inform Assoc 2009, 16:690–704. 10.1197/jamia.M3162CrossRefPubMedPubMedCentral Cohen A, Ambert K, McDonagh M: Cross-topic learning for work prioritization in systematic review creation and update.J Am Med Inform Assoc 2009, 16:690–704. 10.1197/jamia.M3162CrossRefPubMedPubMedCentral
67.
go back to reference Brunton G, Caird J, Sutcliffe K, Rees R, Stokes G, Stansfield C, et al.: Depression, Anxiety, Pain and Quality of Life in People Living with Chronic Hepatitis C: A Systematic Review and Meta-Analysis. London: EPPI Centre, Social Science Research Unit, Institute of Education, University of London; 2014. Brunton G, Caird J, Sutcliffe K, Rees R, Stokes G, Stansfield C, et al.: Depression, Anxiety, Pain and Quality of Life in People Living with Chronic Hepatitis C: A Systematic Review and Meta-Analysis. London: EPPI Centre, Social Science Research Unit, Institute of Education, University of London; 2014.
68.
go back to reference Cohen A: Optimizing feature representation for automated systematic review work prioritization.AMIA Annual Symposium Proceedings: 2008 2008, 121–5. Cohen A: Optimizing feature representation for automated systematic review work prioritization.AMIA Annual Symposium Proceedings: 2008 2008, 121–5.
69.
go back to reference Cohen A, Ambert K, McDonagh M: Studying the potential impact of automated document classification on scheduling a systematic review update.BMC Med Inform Decis Mak 2012,12(1):33. 10.1186/1472-6947-12-33CrossRefPubMedPubMedCentral Cohen A, Ambert K, McDonagh M: Studying the potential impact of automated document classification on scheduling a systematic review update.BMC Med Inform Decis Mak 2012,12(1):33. 10.1186/1472-6947-12-33CrossRefPubMedPubMedCentral
70.
go back to reference Dalal S, Shekelle P, Hempel S, Newberry S, Motala A, Shetty K: A pilot study using machine learning and domain knowledge to facilitate comparative effectiveness review updating.Med Decis Making 2013,33(3):343–55. 10.1177/0272989X12457243CrossRefPubMed Dalal S, Shekelle P, Hempel S, Newberry S, Motala A, Shetty K: A pilot study using machine learning and domain knowledge to facilitate comparative effectiveness review updating.Med Decis Making 2013,33(3):343–55. 10.1177/0272989X12457243CrossRefPubMed
71.
go back to reference Small K, Wallace B, Brodley C, Trikalinos T: The constrained weight space SVM: learning with ranked features. In Proceedings of the 28th International Conference on Machine Learning. Bellevue, WA, USA: ICML; 2011. Small K, Wallace B, Brodley C, Trikalinos T: The constrained weight space SVM: learning with ranked features. In Proceedings of the 28th International Conference on Machine Learning. Bellevue, WA, USA: ICML; 2011.
72.
go back to reference Sampson M, Tetzlaff J, Urquhart C: Precision of healthcare systematic review searches in a cross-sectional sample.Res Synth Methods 2011, 2:119–25. 10.1002/jrsm.42CrossRefPubMed Sampson M, Tetzlaff J, Urquhart C: Precision of healthcare systematic review searches in a cross-sectional sample.Res Synth Methods 2011, 2:119–25. 10.1002/jrsm.42CrossRefPubMed
73.
Sasaki Y: Automatic text classification. University of Manchester: presentation 2008.
74.
Tomek I: Two modifications of CNN. IEEE Trans Syst Man Cybern 1976, SMC-6(11):769–72.
75.
Brinker K: Incorporating diversity in active learning with support vector machines. In Proceedings of the 20th International Conference on Machine Learning: 2003. Palo Alto: AAAI Press; 2003:59–66.
76.
Gama J, Žliobaitė A, Bifet A, Pechenizkiy M, Bouchachia A: A survey on concept drift adaptation. ACM Comput Surv (CSUR) 2014, 46(4):44.
77.
Pan S, Yang Q: A survey on transfer learning. IEEE Trans Knowl Data Eng 2010, 22(10):1345–59.
78.
Davis J, Goadrich M: The relationship between Precision-Recall and ROC curves. In ICML '06 Proceedings of the 23rd international conference on Machine learning 2006. New York, NY, USA: ACM; 2006.
79.
García V, Mollineda R, Sánchez J: A bias correction function for classification performance assessment in two-class imbalanced problems. Knowl Based Syst 2014, 59:66–74.
80.
Tsafnat G, Dunn A, Glasziou P, Coiera E: The automation of systematic reviews. BMJ 2013, 346:f139.
81.
Settles B: Active Learning Literature Survey. Computer Sciences Technical Report 1648. Wisconsin: University of Wisconsin–Madison; 2009.
82.
Sarveniazi A: An actual survey of dimensionality reduction. Am J Comput Math 2014, 4:55–72. 10.4236/ajcm.2014.42006
83.
Elkan C: The foundations of cost-sensitive learning. In International Joint Conference on Artificial Intelligence: 2001. Seattle, Washington: Morgan Kaufmann Publishers Inc; 2001.
84.
Cao P, Zhao D, Zaiane O: An optimized cost-sensitive SVM for imbalanced data learning. In Advances in Knowledge Discovery and Data Mining: 2013. Berlin Heidelberg: Springer; 2013:280–92.
85.
Margineantu D: Active cost-sensitive learning. In Proceedings of the 19th International Joint Conference on Artificial Intelligence: 2005. Burlington: Morgan Kaufmann Publishers Inc; 2005.
86.
Blake C: Beyond genes, proteins, and abstracts: identifying scientific claims from full-text biomedical articles. J Biomed Inform 2010, 43:173–89. 10.1016/j.jbi.2009.11.001
87.
Cohen K, Johnson H, Verspoor K, Roeder C, Hunter L: The structural and content aspects of abstracts versus bodies of full text journal articles are different. BMC Bioinformatics 2010, 11:492.
88.
Truyens M, Van Eecke P: Legal aspects of text mining. Comput Law Secur Rev 2014, 30(2):153–70.
89.
Reichman J, Okediji R: When copyright law and science collide: empowering digitally integrated research methods on a global scale. Minn Law Rev 2012, 96(4):1362–480.
90.
91.
Kiritchenko S, de Bruijn B, Carini S, Martin J, Sim I: ExaCT: automatic extraction of clinical trial characteristics from journal publications. BMC Med Inform Decis Mak 2010, 10(1):56. 10.1186/1472-6947-10-56
92.
Marshall I, Kuiper J, Wallace B: Automating risk of bias assessment for clinical trials. In BCB '14 Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics. New York, NY, USA: ACM; 2014:88–95.
93.
Summerscales R: Automatic Summarization of Clinical Abstracts for Evidence-Based Medicine. Chicago, Illinois: Graduate College of the Illinois Institute of Technology; 2013.
Metadata
Title
Using text mining for study identification in systematic reviews: a systematic review of current approaches
Authors
Alison O’Mara-Eves
James Thomas
John McNaught
Makoto Miwa
Sophia Ananiadou
Publication date
01-12-2015
Publisher
BioMed Central
Published in
Systematic Reviews / Issue 1/2015
Electronic ISSN: 2046-4053
DOI
https://doi.org/10.1186/2046-4053-4-5
