Published in:
Open Access
01-12-2020 | Opioids | Research article
DLI-IT: a deep learning approach to drug label identification through image and text embedding
Authors:
Xiangwen Liu, Joe Meehan, Weida Tong, Leihong Wu, Xiaowei Xu, Joshua Xu
Published in:
BMC Medical Informatics and Decision Making
|
Issue 1/2020
Login to get access
Abstract
Background
Drug label, or packaging insert play a significant role in all the operations from production through drug distribution channels to the end consumer. Image of the label also called Display Panel or label could be used to identify illegal, illicit, unapproved and potentially dangerous drugs. Due to the time-consuming process and high labor cost of investigation, an artificial intelligence-based deep learning model is necessary for fast and accurate identification of the drugs.
Methods
In addition to image-based identification technology, we take advantages of rich text information on the pharmaceutical package insert of drug label images. In this study, we developed the Drug Label Identification through Image and Text embedding model (DLI-IT) to model text-based patterns of historical data for detection of suspicious drugs. In DLI-IT, we first trained a Connectionist Text Proposal Network (CTPN) to crop the raw image into sub-images based on the text. The texts from the cropped sub-images are recognized independently through the Tesseract OCR Engine and combined as one document for each raw image. Finally, we applied universal sentence embedding to transform these documents into vectors and find the most similar reference images to the test image through the cosine similarity.
Results
We trained the DLI-IT model on 1749 opioid and 2365 non-opioid drug label images. The model was then tested on 300 external opioid drug label images, the result demonstrated our model achieves up-to 88% of the precision in drug label identification, which outperforms previous image-based or text-based identification method by up-to 35% improvement.
Conclusion
To conclude, by combining Image and Text embedding analysis under deep learning framework, our DLI-IT approach achieved a competitive performance in advancing drug label identification.