Published in: Drug Safety 10/2014

01-10-2014 | Leading Article

Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art

Authors: Rave Harpaz, Alison Callahan, Suzanne Tamang, Yen Low, David Odgers, Sam Finlayson, Kenneth Jung, Paea LePendu, Nigam H. Shah

Abstract

Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. It is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event (ADE) detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources—such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs—that are amenable to text mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance.
Metadata
Title
Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art
Authors
Rave Harpaz
Alison Callahan
Suzanne Tamang
Yen Low
David Odgers
Sam Finlayson
Kenneth Jung
Paea LePendu
Nigam H. Shah
Publication date
01-10-2014
Publisher
Springer International Publishing
Published in
Drug Safety / Issue 10/2014
Print ISSN: 0114-5916
Electronic ISSN: 1179-1942
DOI
https://doi.org/10.1007/s40264-014-0218-z
