Top

BMC Medical Informatics and Decision Making

Published in:

Open Access 01-12-2021 | Research article

Text classification models for the automatic detection of nonmedical prescription medication use from social media

Authors: Mohammed Ali Al-Garadi, Yuan-Chi Yang, Haitao Cai, Yucheng Ruan, Karen O’Connor, Gonzalez-Hernandez Graciela, Jeanmarie Perrone, Abeed Sarker

Published in: BMC Medical Informatics and Decision Making | Issue 1/2021

Abstract

Background

Prescription medication (PM) misuse/abuse has emerged as a national crisis in the United States, and social media has been suggested as a potential resource for performing active monitoring. However, automating a social media-based monitoring system is challenging—requiring advanced natural language processing (NLP) and machine learning methods. In this paper, we describe the development and evaluation of automatic text classification models for detecting self-reports of PM abuse from Twitter.

Methods

We experimented with state-of-the-art bi-directional transformer-based language models, which utilize tweet-level representations that enable transfer learning (e.g., BERT, RoBERTa, XLNet, AlBERT, and DistilBERT), proposed fusion-based approaches, and compared the developed models with several traditional machine learning, including deep learning, approaches. Using a public dataset, we evaluated the performances of the classifiers on their abilities to classify the non-majority “abuse/misuse” class.

Results

Our proposed fusion-based model performs significantly better than the best traditional model (F₁-score [95% CI]: 0.67 [0.64–0.69] vs. 0.45 [0.42–0.48]). We illustrate, via experimentation using varying training set sizes, that the transformer-based models are more stable and require less annotated data compared to the other models. The significant improvements achieved by our best-performing classification model over past approaches makes it suitable for automated continuous monitoring of nonmedical PM use from Twitter.

Conclusions

BERT, BERT-like and fusion-based models outperform traditional machine learning and deep learning models, achieving substantial improvements over many years of past research on the topic of prescription medication misuse/abuse classification from social media, which had been shown to be a complex task due to the unique ways in which information about nonmedical use is presented. Several challenges associated with the lack of context and the nature of social media language need to be overcome to further improve BERT and BERT-like models. These experimental driven challenges are represented as potential future research directions.

Available only for authorised users

National Institute on Drug Abuse. Misuse of Prescription Drugs. 2018 Dec.

Schepis TS. The prescription drug abuse epidemic : incidence, treatment, prevention, and policy. 1st ed. Praeger; 2018.

Hedegaard H, Miniño AM, Warner M. Drug Overdose Deaths in the United States, 1999–2018 Key findings Data from the National Vital Statistics System, Mortality. 2020 Jan.

Centers for Disease Control and Prevention. Wide-ranging online data for epidemiologic research (WONDER). 2020.

What States Need to Know about PDMPs | Drug Overdose | CDC Injury Center.

Manasco AT, Griggs C, Leeds R, Langlois BK, Breaud AH, Mitchell PM, et al. Characteristics of state prescription drug monitoring programs: a state-by-state survey. Pharmacoepidemiol Drug Saf. 2016;25(7):847–51.CrossRef

Finley EP, Garcia A, Rosen K, McGeary D, Pugh MJ, Potter JS. Evaluating the impact of prescription drug monitoring program implementation: A scoping review. Vol. 17, BMC Health Services Research. BioMed Central Ltd.; 2017.

Hanson CL, Cannon B, Burton S, Giraud-Carrier C. An exploration of social circles and prescription drug abuse through Twitter. J Med Internet Res. 2013 Jan;15(9):e189.CrossRef

Sarker A, DeRoos A, Perrone J. Mining social media for prescription medication abuse monitoring: a review and proposal for a data-centric framework. J Am Med Informatics Assoc. 2019;00:1–15.

10.

Osborne V, Striley CW, Nixon SJ, Winterstein AG, Cottler LB. Sex differences in patterns of prescription opioid non-medical use among 10–18 year olds in the US. Addict Behav. 2019 Feb;89:163–71.CrossRef

11.

Bigeard E, Grabar N, Thiessard F. Detection and Analysis of Drug Misuses. A Study Based on Social Media Messages. Front Pharmacol. 2018 Jul;9:791.

12.

Chary M, Genes N, Giraud-Carrier C, Hanson C, Nelson LS, Manini AF. Epidemiology from tweets: estimating misuse of prescription opioids in the USA from social media. J Med Toxicol. 2017 Dec;13(4):278–86.CrossRef

13.

Sarker A, Gonzalez-Hernandez G, Ruan Y, Perrone J. Machine learning and natural language processing for geolocation-centric monitoring and characterization of opioid-related social media chatter. JAMA Netw open. 2019 Nov;2(11):e1914672.CrossRef

14.

Chary M, Yi D, Manini AF. Candyflipping and other combinations: identifying drug-drug combinations from an online forum. Front Psychiatry. 2018 Apr;9:135.CrossRef

15.

Hanson CL, Burton SH, Giraud-Carrier C, West JH, Barnes MD, Hansen B. Tweaking and tweeting: exploring Twitter for nonmedical use of a psychostimulant drug (Adderall) among college students. J Med Internet Res. 2013 Apr;15(4):e62.CrossRef

16.

Sarker A, O’Connor K, Ginn R, Scotch M, Smith K, Malone D, et al. Social media mining for toxicovigilance: Automatic monitoring of prescription medication abuse from twitter. Drug Saf. 2016;39(3):231–40.CrossRef

17.

Harpaz R, Callahan A, Tamang S, Low Y, Odgers D, Finlayson S, et al. Text mining for adverse drug events: the promise, challenges, and state of the art. Drug Saf. 2014 Oct;37(10):777–90.CrossRef

18.

Paul MJ, Sarker A, Brownstein JS, Nikfarjam A, Scotch M, Smith KL, et al. Social media mining for public health monitoring and surveillance. Pacific Symp Biocomput. 2016;

19.

Jenhani F, Gouider MS. Said L Ben. A Hybrid Approach for Drug Abuse Events Extraction from Twitter. In: Procedia Computer Science; 2016.

20.

Chan B, Lopez A, Sarkar U. The canary in the coal mine tweets: social media reveals public perceptions of non-medical use of opioids. PLoS One. 2015 Aug 7;10(8).

21.

Shutler L, Nelson LS, Portelli I, Blachford C, Perrone J. Drug use in the Twittersphere: a qualitative contextual analysis of tweets about prescription drugs. J Addict Dis. 2015;

22.

Yang M, Kiang M, Shang W. Filtering big data from social media - Building an early warning system for adverse drug reactions. J Biomed Inform. 2015;

23.

Hu H, Phan NH, Chun SA, Geller J, Vo H, Ye X, et al. An insight analysis and detection of drug-abuse risk behavior on Twitter with self-taught deep learning. Comput Soc Networks [Internet]. 2019;6(1):1–19. https://doi.org/10.1186/s40649-019-0071-4

24.

Hu H, Moturu P, Dharan KN, Geller J, Di Iorio S, Phan H. Deep learning model for classifying drug abuse risk behavior in tweets. In: Proceedings - 2018 IEEE International Conference on Healthcare Informatics, ICHI 2018. 2018.

25.

Chancellor S, Nitzburg G, Hu A, Zampieri F, De Choudhury M. Discovering alternative treatments for opioid use recovery using social media. In: Conference on Human Factors in Computing Systems - Proceedings. 2019.

26.

Mozafari M, Farahbakhsh R, Crespi N. A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media. 2019;1–12. Available from: http://arxiv.org/abs/1910.12574

27.

Mozafari M, Farahbakhsh R, Crespi N. Hate speech detection and racial bias mitigation in social media based on BERT model. PLoS One. 2020;

28.

Wang T, Lu K, Chow KP, Zhu Q. COVID-19 Sensing: Negative Sentiment Analysis on Social Media in China via BERT Model. IEEE Access. 2020;

29.

Abdul-Mageed M, Zhang C, Rajendran A, Elmadany AR, Przystupa M, Ungar L. Sentence-level BERT and multi-task learning of age and gender in social media. arXiv. 2019.

30.

Devlin J, Chang M-W, Lee K, Google KT, Language AI. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [Internet]. [cited 2020 Jan 16]. Available from: https://github.com/tensorflow/tensor2tensor

31.

Alsentzer E, Murphy JR, Boag W, Weng W-H, Jin D, Naumann T, et al. Publicly Available Clinical BERT Embeddings [Internet]. [cited 2019 Dec 11]. Available from: https://www.ncbi.nlm.nih.gov/pmc/

32.

Mikolov T, Chen K, Corrado G, Dean J. Distributed Representations of Words and Phrases and their Compositionality. Nips. 2013;1–9.

33.

O’Connor K, Sarker A, Perrone J, Gonzalez HG. Promoting reproducible research for characterizing nonmedical use of medications through data annotation: description of a Twitter corpus and guidelines. J Med Internet Res. 2020 Feb;22(2):e15861.CrossRef

34.

Sarker A, Gonzalez-Hernandez G. An unsupervised and customizable misspelling generator for mining noisy health-related text sources. J Biomed Inform. 2018;88.

35.

Fernández-Delgado M, Cernadas E, Barro S, Amorim D, Amorim F-D. Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res. 2014;15:3133–81.

36.

Platt J, others. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv large margin Classif. 1999;

37.

Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. Taipei; 2019 Nov.

38.

Kiefer J, Wolfowitz J. Stochastic estimation of the maximum of a regression function. Ann Math Stat. 1952

39.

Statistics LB, Statistics LB, Breiman L. Random forests. Mach Learn. 2001;45:5–32.CrossRef

40.

Rish I. An empirical study of the naive Bayes classifier. IJCAI 2001 Work Empir methods Artif Intell. 2001

41.

Cover TM, Hart PE. Nearest neighbor pattern classification. IEEE Trans Inf Theory. 1967;13(1):21–7.CrossRef

42.

Sarker A. Gonzalez G. A corpus for mining drug-related knowledge from Twitter chatter: Language models and their utilities. Data Br; 2017. p. 10.

43.

Conneau A, Schwenk H, Le Cun Y, Lo¨ıc Barrault L. Very Deep Convolutional Networks for Text Classification. Vol. 1, the Association for Computational Linguistics. 2017.

44.

Jacovi A, Shalom OS, Goldberg Y. Understanding convolutional neural networks for text classification. arXiv. 2018.

45.

Pennington J, Socher R. Manning CD. Glove: Global Vectors for Word Representation; 2014. p. 1532–43.

46.

Zhang X, Zhao J, Lecun Y. Character-level Convolutional Networks for Text Classification *.

47.

Liu P, Qiu X, Huang X. Recurrent Neural Network for Text Classification with Multi-Task Learning.

48.

Sutskever I, Martens J, Hinton G. Generating Text with Recurrent Neural Networks. In: 28 th International Conference on Machine Learning. Bellevue; 2011.

49.

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. In: Advances in Neural Information Processing Systems. 2017.

50.

Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach. 2019 Jul;

51.

Lan Z, Chen M, Goodman S, Gimpel K, Sharma P, Soricut R. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. 2019 Sep;

52.

Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, Le Q V. XLNet: Generalized Autoregressive Pretraining for Language Understanding. 2019 Jun;

53.

Sanh V, Debut L, Chaumond J, Wolf T. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. 2019 Oct;

54.

Efron B. Bootstrap Methods: Another Look at the Jackknife. Vol. 7, The Annals of Statistics. Institute of Mathematical Statistics; p. 1–26.

55.

Sagi O, Rokach L. Ensemble learning: A survey. Wiley Interdiscip Rev Data Min Knowl Discov. 2018 Jul;8(4).

56.

Ettinger A. What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models. Trans Assoc Comput Linguist. 2020 Jan;8:34–48.CrossRef

57.

Sarker A, Belousov M, Friedrichs J, Hakala K, Kiritchenko S, Mehryary F, et al. Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task. J Am Med Informatics Assoc. 2018 Oct;25(10):1274–83.CrossRef

Title: Text classification models for the automatic detection of nonmedical prescription medication use from social media
Authors: Mohammed Ali Al-Garadi
Yuan-Chi Yang
Haitao Cai
Yucheng Ruan
Karen O’Connor
Gonzalez-Hernandez Graciela
Jeanmarie Perrone
Abeed Sarker
Publication date: 01-12-2021
Publisher: BioMed Central
Published in: BMC Medical Informatics and Decision Making / Issue 1/2021
Electronic ISSN: 1472-6947
DOI: https://doi.org/10.1186/s12911-021-01394-0

At a glance: The STEP trials

Springer Medicine

Text classification models for the automatic detection of nonmedical prescription medication use from social media

Abstract

Background

Methods

Results

Conclusions

At a glance: The STEP trials

Springer Medicine

Abstract

Background

Methods

Results

Conclusions

Please log in to get access to this content

Other articles of this Issue 1/2021

Assessing the suitability of general practice electronic health records for clinical prediction model development: a data quality assessment

Making sense of DialysisConnect: a qualitative analysis of stakeholder viewpoints on a web-based information exchange platform to improve care transitions between dialysis clinics and hospitals

The future of medical scribes documenting in the electronic health record: results of an expert consensus conference

Applying artificial neural network for early detection of sepsis with intentionally preserved highly missing real-world data for simulating clinical situation

Classifying patient and professional voice in social media health posts

Transformer-based deep neural network language models for Alzheimer’s disease risk assessment from targeted speech