Published in: BMC Medical Informatics and Decision Making 1/2021

Open Access 01-12-2021 | Research article

An improved BM25 algorithm for clinical decision support in Precision Medicine based on co-word analysis and Cuckoo Search

Author: Zicheng Zhang

Abstract

Background

Retrieving gene and disease information from a vast collection of biomedical abstracts to provide doctors with clinical decision support is one of the important research directions of Precision Medicine.

Method

We propose a novel article retrieval method based on expanded-word and co-word analyses, and we use Cuckoo Search to optimize the parameters of the retrieval function. The main goal is to retrieve the abstracts of biomedical articles that refer to treatments. The methods discussed in this manuscript use the BM25 algorithm to score abstracts. We propose an improved version of BM25 that also computes scores for expanded words and co-words, yielding a composite retrieval function that is then optimized with Cuckoo Search. The proposed method aims to find both disease and gene information in the abstract of the same biomedical article, so that articles covering both are judged more relevant and receive higher scores. In addition, we investigate the influence of different parameters on the retrieval algorithm and summarize how they meet various retrieval needs.
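
To make the idea of a composite retrieval function concrete, here is a minimal Python sketch. The `bm25` function is the standard Okapi BM25 formula with a non-negative IDF variant. The `composite_score` function, its weights `w_exp` and `w_co`, and the co-word bonus are assumptions made for illustration only; the abstract does not give the article's actual formula, so this should not be read as the author's method.

```python
import math
from collections import Counter


def corpus_stats(docs):
    """Return (document frequency per term, number of docs, average doc length)."""
    doc_freq = Counter(term for doc in docs for term in set(doc))
    avg_len = sum(len(doc) for doc in docs) / len(docs)
    return doc_freq, len(docs), avg_len


def bm25(terms, doc, stats, k1=1.2, b=0.75):
    """Okapi BM25 score of one tokenized document for a list of query terms."""
    doc_freq, n_docs, avg_len = stats
    tf = Counter(doc)
    score = 0.0
    for term in terms:
        if tf[term] == 0:
            continue  # absent terms contribute nothing
        df = doc_freq.get(term, 0)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))  # non-negative IDF variant
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
    return score


def composite_score(disease, gene, expanded, doc, stats, w_exp=0.5, w_co=1.0):
    """ASSUMED combination (not the article's formula): base BM25 score,
    plus a weighted score for expanded words, plus a bonus when disease
    and gene terms co-occur in the same abstract."""
    base = bm25(disease + gene, doc, stats)
    expansion = bm25(expanded, doc, stats)
    present = set(doc)
    co_occurs = any(t in present for t in disease) and any(t in present for t in gene)
    co_bonus = base if co_occurs else 0.0
    return base + w_exp * expansion + w_co * co_bonus


# Toy corpus of tokenized abstracts (made up for this example).
docs = [
    ["melanoma", "braf", "inhibitor", "treatment"],
    ["melanoma", "surgery", "outcome", "study"],
    ["braf", "mutation", "colorectal", "cancer"],
]
stats = corpus_stats(docs)
for doc in docs:
    print(round(composite_score(["melanoma"], ["braf"], ["inhibitor"], doc, stats), 3), doc)
```

In this sketch, the abstract that mentions both the disease and the gene ranks above abstracts that mention only one of them. The free parameters (`k1`, `b`, `w_exp`, `w_co`) are the kind of values a parameter search could tune.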

Results

The data used in this manuscript are sourced from medical articles presented in the Text Retrieval Conference (TREC) Clinical Decision Support (CDS) Tracks of 2017, 2018, and 2019 in Precision Medicine. A total of 120 topics are tested. Three indicators are used to compare the methods; to keep the experiments comparable, the compared methods are selected from those based only on the BM25 algorithm and its improved versions. The results show that the proposed algorithm achieves better results than the compared methods.

Conclusion

The proposed method, an improved version of the BM25 algorithm, uses both co-word analysis and Cuckoo Search, and has been verified to achieve better results on a large number of experimental sets. The query expansion method implemented in this manuscript is relatively simple; future research will focus on ontologies and semantic networks to expand the query vocabulary.
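
As a rough illustration of how Cuckoo Search can tune retrieval parameters, the following Python sketch implements a simplified variant of the algorithm: Lévy-flight steps drawn with Mantegna's method, greedy replacement of a randomly chosen nest, and abandonment of the worst fraction `pa` of nests. The function names, default values, and toy objective are assumptions for illustration; in a real setting the objective would be a retrieval metric computed on training topics, and the article's exact configuration is not stated in the abstract.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Lévy-distributed step length using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = abs(rng.gauss(0, 1)) or 1e-12  # guard against division by zero
    return u / v ** (1 / beta)


def cuckoo_search(objective, bounds, n_nests=15, pa=0.25, alpha=0.05,
                  n_iter=200, seed=0):
    """Maximize `objective` over the box `bounds` (list of (low, high) pairs)."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(x):
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]
    n_abandon = int(pa * n_nests)

    for _ in range(n_iter):
        # 1) Each cuckoo takes a Lévy flight; its egg replaces a randomly
        #    chosen nest only if it is better than that nest.
        for i in range(n_nests):
            cand = clip([xi + alpha * (hi - lo) * levy_step(rng)
                         for xi, (lo, hi) in zip(nests[i], bounds)])
            cand_fit = objective(cand)
            j = rng.randrange(n_nests)
            if cand_fit > fitness[j]:
                nests[j], fitness[j] = cand, cand_fit
        # 2) Abandon the worst nests and rebuild them at random positions.
        order = sorted(range(n_nests), key=lambda k: fitness[k])
        for k in order[:n_abandon]:
            nests[k] = random_nest()
            fitness[k] = objective(nests[k])

    best = max(range(n_nests), key=lambda k: fitness[k])
    return nests[best], fitness[best]


# Toy objective standing in for a retrieval metric: it peaks at
# k1 = 1.2, b = 0.75 (values chosen only for this demonstration).
def toy_metric(params):
    k1, b = params
    return -((k1 - 1.2) ** 2 + (b - 0.75) ** 2)


best_params, best_value = cuckoo_search(toy_metric, [(0.0, 3.0), (0.0, 1.0)])
print(best_params, best_value)
```

Because better candidates only ever replace worse nests and only the worst nests are abandoned, the best solution found so far is never lost between iterations.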
Metadata
Title
An improved BM25 algorithm for clinical decision support in Precision Medicine based on co-word analysis and Cuckoo Search
Author
Zicheng Zhang
Publication date
01-12-2021
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2021
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-021-01454-5
