Published in: Chinese Medicine 1/2017

Open Access 01-12-2017 | Research

Revealing topics and their evolution in biomedical literature using Bio-DTM: a case study of ginseng

Authors: Qian Chen, Ni Ai, Jie Liao, Xin Shao, Yufeng Liu, Xiaohui Fan

Abstract

Background

Biomedical research produces a wealth of valuable scientific results, but they are widely scattered across the literature. Topic modeling enables researchers to discover themes in an unstructured collection of documents without any prior annotations or labels. In this paper, taking ginseng as an example, a biological dynamic topic model (Bio-DTM) is proposed to conduct a retrospective study and interpret the temporal evolution of ginseng research.

Methods

The Bio-DTM system comprises four main components: document pre-processing, bio-dictionary construction, dynamic topic modeling, and topic analysis and visualization. Scientific articles pertaining to ginseng were retrieved from PubMed through text mining. The bio-dictionary integrates the MedTerms medical dictionary, the second edition of the Side Effect Resource (SIDER), a dictionary of biology, and the HUGO Gene Nomenclature Committee (HGNC) database of human gene names. A dynamic topic model, a text mining technique, was used to capture the development trends of topics in a sequentially collected set of documents. In addition to the content of each topic, the evolution of topics over time was visualized using ThemeRiver.
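
The pre-processing and bio-dictionary steps can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny `BIO_DICTIONARY`, the toy records, and the helper names `tokenize`, `keep_bio_terms` and `build_time_slices` are all hypothetical, and it assumes that each retrieved record is reduced to a (year, abstract text) pair. The sketch keeps only dictionary terms, orders the documents by year, and counts documents per year, which is the kind of time-sliced input a dynamic topic model typically expects.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stand-in for the merged bio-dictionary described in the text
# (MedTerms, SIDER, a dictionary of biology, HGNC); real dictionaries are far larger.
BIO_DICTIONARY = {"ginsenoside", "insulin", "apoptosis", "allelopathy", "glucose"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keep_bio_terms(tokens, dictionary):
    """Keep only the tokens that appear in the bio-dictionary."""
    return [token for token in tokens if token in dictionary]


def build_time_slices(records, dictionary):
    """Turn (year, text) records into time-ordered bag-of-words documents.

    Returns the sorted years, the documents ordered by year (as term counts),
    and the number of documents in each year. Documents that contain no
    dictionary term are dropped.
    """
    by_year = defaultdict(list)
    for year, text in records:
        terms = keep_bio_terms(tokenize(text), dictionary)
        if terms:
            by_year[year].append(Counter(terms))
    years = sorted(by_year)
    docs = [doc for year in years for doc in by_year[year]]
    time_slice = [len(by_year[year]) for year in years]
    return years, docs, time_slice


# Toy records, invented for illustration only (not taken from PubMed).
records = [
    (2007, "Allelopathy of ginseng: ginsenoside release in soil."),
    (2005, "Ginsenoside lowers glucose and raises insulin sensitivity."),
    (2007, "Ginsenoside induces apoptosis in tumour cells."),
    (2006, "A history of the ginseng trade."),  # no dictionary terms, so dropped
]

years, docs, time_slice = build_time_slices(records, BIO_DICTIONARY)
print(years, time_slice)
```

The `time_slice` list of per-period document counts, together with the time-ordered documents, matches the input shape used by some dynamic topic model implementations (for example, the `time_slice` argument of gensim's `LdaSeqModel`); whether the authors used such a library is not stated in the abstract.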

Results

Topic 9 showed that ginseng has been used in dietary supplements and in complementary and integrative health practices, and that it has been very popular since the early twentieth century. Topic 6 indicated that the cultivation of ginseng is a major area of research, and that the symbiosis and allelopathy of ginseng became a research hotspot in 2007. In addition, the Bio-DTM model gave insight into the main pharmacological effects of ginseng, such as anti-metabolic disorder, cardioprotective, anti-cancer, hepatoprotective, anti-thrombotic and neuroprotective effects.

Conclusion

The Bio-DTM model not only reveals what ginseng research involves but also shows how these topics evolve over time. This approach can be applied in the biomedical field to conduct retrospective studies and guide future research.
Metadata
Title
Revealing topics and their evolution in biomedical literature using Bio-DTM: a case study of ginseng
Authors
Qian Chen
Ni Ai
Jie Liao
Xin Shao
Yufeng Liu
Xiaohui Fan
Publication date
01-12-2017
Publisher
BioMed Central
Published in
Chinese Medicine / Issue 1/2017
Electronic ISSN: 1749-8546
DOI
https://doi.org/10.1186/s13020-017-0148-7
