mathanamathav/NLP-Tablets-Annotation

College NLP Hackathon

Jupyter Notebook, MIT License

NLP-Tablets-Annotation

Given a set of tablet images, run OCR to convert each image to text, then extract the necessary details: the name of the medicine, the molecules it contains, the date of manufacture, and the expiry date. Convert the extracted text into speech. The extraction step can be supported by a drug database: scrape drug details and form a lexicon from them. An API can be used for the text-to-speech conversion.
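The extraction step described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the `LEXICON` entries, the `DATE_RE` pattern, and the function names `extract_dates` and `match_drug_names` are all assumptions made for the example. It takes text that OCR has already produced, looks up drug names in a small lexicon with fuzzy matching (to tolerate OCR misreads), and pulls out manufacturing and expiry dates with a regular expression.

```python
import difflib
import re

# Hypothetical lexicon mapping drug names to their molecules.
# In the project this would be built from the scraped drug database.
LEXICON = {
    "paracetamol": ["paracetamol"],
    "crocin": ["paracetamol"],
    "augmentin": ["amoxicillin", "clavulanic acid"],
}

# Assumed label formats, e.g. "Mfg. 03/2021", "EXP 02/2024", "Expiry Date: FEB 2024".
DATE_RE = re.compile(
    r"\b(?P<kind>mfg|mfd|exp(?:iry)?)\b\.?\s*(?:date)?\s*[:.]?\s*"
    r"(?P<date>(?:\d{1,2}|[a-z]{3,9})[\s./-]+\d{2,4})",
    re.IGNORECASE,
)


def extract_dates(text):
    """Return the first manufacturing and expiry date found in the OCR text."""
    dates = {}
    for match in DATE_RE.finditer(text):
        label = match.group("kind").lower()
        kind = "expiry" if label.startswith("exp") else "manufactured"
        # Keep only the first occurrence of each kind.
        dates.setdefault(kind, match.group("date").strip())
    return dates


def match_drug_names(text, cutoff=0.8):
    """Return lexicon entries that closely match any word in the OCR text."""
    found = []
    for token in re.findall(r"[a-z]+", text.lower()):
        hit = difflib.get_close_matches(token, list(LEXICON), n=1, cutoff=cutoff)
        if hit and hit[0] not in found:
            found.append(hit[0])
    return found


if __name__ == "__main__":
    ocr_text = "Crocin Advance Paracetamol Tablets IP 500 mg Mfg. 03/2021 Exp. 02/2024"
    for name in match_drug_names(ocr_text):
        print(name, "->", LEXICON[name])
    print(extract_dates(ocr_text))
```

The `cutoff` value trades recall against false positives: a lower value tolerates more OCR noise but risks matching unrelated words. A trained NER model, as listed under the libraries below, would replace or complement this kind of rule-based lookup.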

Check out the report made for this project here.

Download the required model from here and add it to the model folder.

Tasks Done

Datasets Used

Libraries Used

PaddleOCR - For text extraction from images
spaCy - For training the NER model
OpenCV (cv2) - For image processing
NLTK - For text processing
TTS - Google API for text-to-speech conversion
Streamlit - For building the web application

Colab Links