Published in: BMC Medical Informatics and Decision Making 1/2012

Open Access 01-12-2012 | Research article

Automated systems to identify relevant documents in product risk management

Authors: Xue Ting Wee, Yvonne Koh, Chun Wei Yap

Abstract

Background

Product risk management involves critical assessment of the risks and benefits of health products circulating in the market. One important source of safety information is the primary literature, especially for newer products with which regulatory authorities have relatively little experience. Although the primary literature provides vast and diverse information, only a small proportion of it is useful for product risk assessment work. Hence, the aim of this study is to explore the possibility of using text mining to automate the identification of useful articles, which would reduce the time taken for literature searches and hence improve work efficiency. In this study, term frequency-inverse document frequency values were computed for predictors extracted from the titles and abstracts of articles related to three tumour necrosis factor-alpha blockers. A general automated system was developed using only general predictors and was tested for its generalizability using articles related to four other drug classes. Several specific automated systems were developed using both general and specific predictors and training sets of different sizes, in order to determine the minimum number of articles required for developing such systems.
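As an illustration of the term frequency-inverse document frequency weighting mentioned above, the following is a minimal sketch only. The abstract does not state which weighting variant, tokenisation or software settings were used, so the formula tf × log(N/df), the `tokenize` and `tf_idf` helpers and the three example titles are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase the text and keep runs of letters as terms.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents in which each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical titles, not taken from the study's data set.
docs = [
    "Infliximab and risk of lymphoma",
    "Infliximab dosing in arthritis",
    "Lymphoma case report",
]
weights = tf_idf(docs)
```

In this toy corpus a term found in only one title (such as "risk") receives a higher weight than a term shared by two titles (such as "infliximab"), which is the property that makes such values usable as predictors.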

Results

When tested on the generalizability set, the general automated system had an area under the curve (AUC) value of 0.731 and was able to rank 34.6% and 46.2% of the total number of 'useful' articles among the first 10% and 20% of the articles presented to the evaluators. However, its use may be limited by the subjective definition of useful articles. Only 20 articles were required to develop a specific automated system with a prediction performance (AUC 0.748) better than that of the general automated system.
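The two kinds of figures reported above (an AUC value, and the share of 'useful' articles ranked within the first 10% or 20% of the list) can be computed from any set of scores and labels. The sketch below is an assumption-laden illustration, not the authors' evaluation code: the function names `auc` and `recall_in_top`, the handling of ties and the toy scores and labels are all hypothetical.

```python
def auc(scores, labels):
    """Probability that a random useful article outscores a random
    non-useful one; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def recall_in_top(scores, labels, fraction):
    """Share of all useful articles found in the top `fraction` of the
    list when articles are sorted by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    found = sum(y for _, y in ranked[:cutoff])
    return found / sum(labels)


# Toy data: 10 articles, 3 of them labelled useful (1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

toy_auc = auc(scores, labels)
toy_recall_20 = recall_in_top(scores, labels, 0.2)
```

An AUC of 0.5 corresponds to a random ordering, so values such as 0.731 or 0.748 indicate that useful articles tend to be placed ahead of the others, although not perfectly.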

Conclusions

Specific automated systems can be developed rapidly and avoid problems caused by subjective definition of useful articles. Thus the efficiency of product risk management can be improved with the use of specific automated systems.
Metadata
Title: Automated systems to identify relevant documents in product risk management
Authors: Xue Ting Wee, Yvonne Koh, Chun Wei Yap
Publication date: 01-12-2012
Publisher: BioMed Central
Published in: BMC Medical Informatics and Decision Making, Issue 1/2012
Electronic ISSN: 1472-6947
DOI: https://doi.org/10.1186/1472-6947-12-13
