MedFact: Towards Improving Veracity of Medical Information in Social Media Using Applied Machine Learning

Samuel, Hamman; Zaïane, Osmar

doi:10.1007/978-3-319-89656-4_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10832)

Included in the following conference series:

Canadian Conference on Artificial Intelligence

Abstract

Since the advent of Web 2.0 and social media, anyone with an Internet connection can create content online, including uncertain or fake information, a problem that has attracted significant attention recently. In this study, we address the challenge of uncertain online health information by automating systematic approaches borrowed from evidence-based medicine. Our proposed algorithm, MedFact, recommends trusted medical information within health-related social media discussions and empowers online users to make informed decisions about the credibility of online health information. MedFact automatically extracts relevant keywords from online discussions and queries trusted medical literature with the aim of embedding related factual information into the discussion. Our retrieval model takes into account layperson terminology and the hierarchy of evidence. Consequently, MedFact is a departure from the consensus-based approaches to determining credibility that are popular in social media, such as “wisdom of the crowd”, binary “Like” votes, and ratings. Moving away from these subjective metrics, MedFact introduces objective metrics. We also present preliminary work towards a granular veracity score that uses supervised machine learning to compare statements within uncertain social media text and trusted medical text. We evaluate our proposed algorithm on various data sets from existing health social media involving both patient and medic discussions, with promising results and suggestions for ongoing improvements and future research.
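
The retrieval idea in the abstract (normalize layperson terminology, then prefer stronger evidence) can be illustrated with a minimal sketch. This is not the authors' implementation: the `LAY_TO_PROFESSIONAL` dictionary, the `EVIDENCE_RANK` ordering, the `Article` type, and the title-matching rule are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical lay-to-professional mapping; a real system would draw on
# curated vocabularies rather than a hand-written dictionary.
LAY_TO_PROFESSIONAL = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
    "flu": "influenza",
}

# Assumed evidence hierarchy, strongest first (illustrative ordering only).
EVIDENCE_RANK = {
    "systematic review": 0,
    "randomized controlled trial": 1,
    "cohort study": 2,
    "case report": 3,
    "expert opinion": 4,
}


@dataclass(frozen=True)
class Article:
    title: str
    evidence_type: str


def normalize_terms(keywords):
    """Lowercase each keyword and swap known lay terms for professional ones."""
    return [LAY_TO_PROFESSIONAL.get(k.lower(), k.lower()) for k in keywords]


def rank_articles(keywords, articles):
    """Keep articles whose title mentions a query term, strongest evidence first."""
    terms = normalize_terms(keywords)

    def matches(article):
        title = article.title.lower()
        return sum(term in title for term in terms)

    hits = [a for a in articles if matches(a) > 0]
    # Sort by evidence level first, then by number of matched terms (descending).
    unknown = len(EVIDENCE_RANK)
    return sorted(
        hits,
        key=lambda a: (EVIDENCE_RANK.get(a.evidence_type, unknown), -matches(a)),
    )
```

Calling `rank_articles(["Heart attack", "aspirin"], articles)` would first rewrite "heart attack" as "myocardial infarction" and then list any matching systematic review ahead of a matching case report. Sorting on evidence level before match count is a design choice made for this sketch; a production retrieval model would likely combine the two signals in a weighted score.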

Notes

1. The Gensim Python API includes an implementation of the TextRank algorithm [21]: https://radimrehurek.com/gensim/summarization/keywords.html
2. SNOMED CT data set, available from the U.S. National Library of Medicine (NLM): https://nlm.nih.gov/healthit/snomedct
3. CHV data set, available from the Consumer Health Vocabulary Initiative: http://consumerhealthvocab.org
4. SEW historical data set, available via the PIKES home page: http://pikes.fbk.eu/eval-sew.html
5. The TRIP database is accessible programmatically via web services that were most kindly made available to the authors by Jon Brassey, the TRIP database creator: https://tripdatabase.com/addtrip
6. POS tagging uses the Penn Treebank tag set; all steps in this particular pipeline are programmed with the NLTK Python library: http://nltk.org
7. Sentiment analysis is performed using the TextBlob Python library: http://textblob.readthedocs.io
8. The spaCy Python library is used for generating dependency trees: https://spacy.io
9. We implement a shallow CNN with the ConText tool: https://github.com/riejohnson/ConText
10. Health Stack Exchange’s beta web site: https://health.stackexchange.com
11. Data set curated from the Stack Exchange Data Dump from the Internet Archive: https://archive.org/details/stackexchange
12. QuackWatch web site: http://quackwatch.org
13. DocCheck web site: http://doccheck.com
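
Note 1 refers to TextRank keyword extraction. The sketch below shows the core idea (PageRank over a word co-occurrence graph) using only the Python standard library. It is not the Gensim implementation and not the authors' code: the `STOPWORDS` set, the parameter defaults, and the function name `textrank_keywords` are assumptions, and a stopword filter stands in for the part-of-speech filter used in the original TextRank paper. The `gensim.summarization` module linked in note 1 was removed in Gensim 4.0, so that link applies to 3.x releases.

```python
import re
from collections import defaultdict

# Small illustrative stopword list (an assumption); the original TextRank
# filters candidates by part of speech instead.
STOPWORDS = {
    "the", "a", "an", "of", "and", "or", "to", "in", "is",
    "for", "with", "on", "that", "it", "my", "i",
}


def textrank_keywords(text, window=2, top_k=5, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

    # Link every word to the distinct words that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    # Power iteration: each word shares its score equally among its neighbors.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

A fixed iteration count keeps the sketch short; a fuller implementation would stop when the scores change by less than a tolerance and would merge adjacent keywords into multi-word phrases.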

References

1. Kata, A.: Anti-vaccine activists, web 2.0, and the postmodern paradigm-an overview of tactics and tropes used online by the anti-vaccination movement. Vaccine 30(25), 3778–3789 (2012)
2. Rippen, H., Risk, A.: e-Health code of ethics (May 24). J. Med. Internet Res. 2(2) (2000)
3. Greenhalgh, T.: How to Read a Paper: The Basics of Evidence-Based Medicine. Wiley, Chichester (2010)
4. Ackley, B.J.: Evidence-Based Nursing Care Guidelines: Medical-Surgical Interventions. Elsevier Health Sciences, St. Louis (2008)
5. Child, J.: Trust-the fundamental bond in global collaboration. Organ. Dyn. 29(4), 274–288 (2001)
6. Varlamis, I., Eirinaki, M., Louta, M.: A study on social network metrics and their application in trust networks. In: Proceedings of the IEEE International Conference on Advances in Social Networks Analysis and Mining, pp. 168–175 (2010)
7. Abdaoui, A., Azé, J., Bringay, S., Poncelet, P.: Collaborative content-based method for estimating user reputation in online forums. In: Wang, J., Cellary, W., Wang, D., Wang, H., Chen, S.-C., Li, T., Zhang, Y. (eds.) WISE 2015. LNCS, vol. 9419, pp. 292–299. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-26187-4_26
8. Grant, S., Betts, B.: Encouraging user behaviour with achievements: an empirical study. In: IEEE International Working Conference on Mining Software Repositories (MSR), pp. 65–68 (2013)
9. Aljazzaf, Z.M.: Trust-Based Service Selection. Ph.D. thesis. University of Western Ontario (2011)
10. Park, M.: HealthTrust: Assessing the Trustworthiness of Healthcare Information on the Internet. Ph.D. thesis. University of Kansas (2013)
11. Aphinyanaphongs, Y., Aliferis, C., et al.: Text categorization models for identifying unproven cancer treatments on the web. In: World Congress on Medical Informatics (MedInfo), p. 968. IOS Press (2007)
12. Oliphant, T.: “I am making my decision on the basis of my experience”: constructing authoritative knowledge about treatments for depression. Can. J. Inf. Libr. Sci. 33(3–4), 215–232 (2009)
13. Stephens, G.J., Silbert, L.J., Hasson, U.: Speaker-listener neural coupling underlies successful communication. Proc. Natl. Acad. Sci. 107(32), 14425–14430 (2010)
14. Nyhan, B., Reifler, J., Richey, S., Freed, G.L.: Effective messages in vaccine promotion: a randomized trial. Pediatrics 133(4) (2014)
15. Nyhan, B., Reifler, J.: When corrections fail: the persistence of political misperceptions. Polit. Behav. 32(2), 303–330 (2010)
16. Plous, S.: The Psychology of Judgment and Decision Making. McGraw-Hill, New York (1993)
17. Dunning, D.: The Dunning-Kruger effect: on being ignorant of one’s own ignorance. Adv. Exp. Soc. Psychol. 44, 247 (2011)
18. Proctor, R., Schiebinger, L.L.: Agnotology: The Making and Unmaking of Ignorance. Stanford University Press, Stanford (2008)
19. Henderson, J.: Expert and lay knowledge: a sociological perspective. Nutr. Diet. 67(1), 4–5 (2010)
20. Straus, S.E., Richardson, S.W., Glasziou, P., Haynes, B.R.: Evidence-Based Medicine: How to Practice and Teach EBM. Elsevier/Churchill Livingstone, New York (2005)
21. Mihalcea, R., Tarau, P.: TextRank: bringing order into text. In: EMNLP, vol. 4, pp. 404–411 (2004)
22. Cornet, R., de Keizer, N.: Forty years of SNOMED: a literature review. BMC Med. Inform. Decis. Mak. 8(1), S2 (2008)
23. Smith, C., Stavri, P.: Consumer health vocabulary. In: Consumer Health Informatics, pp. 122–128 (2005)
24. Corcoglioniti, F., Rospocher, M., Aprosio, A.P.: Extracting knowledge from text with PIKES. In: International Semantic Web Conference (2015)
25. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient Estimation of Word Representations in Vector Space. arXiv (2013)
26. Brassey, J.: TRIP database: identifying high quality medical literature from a range of sources. New Rev. Inf. Netw. 11(2), 229–234 (2005)
27. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008)
28. Pang, B., Lee, L., et al.: Opinion mining and sentiment analysis. Found. Trends® Inf. Retr. 2(1–2), 1–135 (2008)
29. De Marneffe, M.C., Manning, C.D.: Stanford Typed Dependencies Manual. Technical report, Stanford University (2008)
30. Johnson, R., Zhang, T.: Effective use of word order for text categorization with convolutional neural networks. In: North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT) (2015)
31. Landis, J.R., Koch, G.G.: The measurement of observer agreement for categorical data. Biometrics 33(1), 159–174 (1977)

Acknowledgement

We thank the Alberta Machine Intelligence Institute (Amii) for funding this research.

Author information

Authors and Affiliations

Department of Computing Science, University of Alberta, Edmonton, Canada
Hamman Samuel & Osmar Zaïane

Corresponding author

Correspondence to Hamman Samuel.

Editor information

Editors and Affiliations

Ryerson University, Toronto, Ontario, Canada
Ebrahim Bagheri
McGill University, Montréal, Québec, Canada
Jackie C.K. Cheung

Cite this paper

Samuel, H., Zaïane, O. (2018). MedFact: Towards Improving Veracity of Medical Information in Social Media Using Applied Machine Learning. In: Bagheri, E., Cheung, J. (eds) Advances in Artificial Intelligence. Canadian AI 2018. Lecture Notes in Computer Science, vol 10832. Springer, Cham. https://doi.org/10.1007/978-3-319-89656-4_9

DOI: https://doi.org/10.1007/978-3-319-89656-4_9
Published: 06 April 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89655-7
Online ISBN: 978-3-319-89656-4
eBook Packages: Computer Science, Computer Science (R0)