Sign-Board-Reader
This is a Python-based NLP project that recognizes English sign and information boards and translates them into Indian regional languages, with voice support. The app supports five Indian languages: Tamil, Telugu, Hindi, Kannada, and Marathi.
Architecture
Necessary Libraries
- OpenCV: OpenCV (Open Source Computer Vision Library) is a popular open-source library for processing images and videos. It provides algorithms for tasks such as image processing, feature detection, and object recognition, and is widely used in fields such as robotics and augmented reality.
- Pytesseract: Pytesseract is a Python wrapper for Google's Tesseract OCR (Optical Character Recognition) engine. It lets developers call Tesseract from their Python applications to recognize text in images and convert it into machine-readable text.
- Googletrans: Googletrans is an unofficial Python library that translates text through the Google Translate web API. It provides a simple interface for translating text between languages and supports over 100 of them. In this project, it is used to translate the recognized English text into regional languages.
- gTTS: gTTS (Google Text-to-Speech) is a Python library that generates speech from text using Google Translate's text-to-speech API. It supports several languages and provides a simple interface for saving spoken audio. In this project, it is used to generate audio for the translated text.
- spaCy: spaCy is an open-source NLP library for Python that provides capabilities such as named entity recognition, part-of-speech tagging, and dependency parsing. spaCy is designed to be fast, efficient, and easy to use.
- NLTK: NLTK (Natural Language Toolkit) is a popular open-source NLP library for Python. It provides a range of NLP capabilities such as tokenization, stemming, lemmatization, part-of-speech tagging, and more. NLTK is widely used in research and academia and provides a large collection of corpora, lexical resources, and datasets.
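
The way these libraries could fit together is sketched below: OpenCV loads and binarizes the image, Pytesseract extracts the English text, Googletrans translates it, and gTTS writes the audio. This is a minimal sketch under stated assumptions, not the project's actual code. The function names, the `LANGUAGE_CODES` table, and the Otsu thresholding step are illustrative choices. It also assumes the Tesseract binary is installed, that network access is available, and that the installed Googletrans release exposes a synchronous `translate()` (newer releases may be asynchronous).

```python
# Illustrative sketch of the pipeline: OCR -> translation -> speech.
# All function names here are hypothetical, not taken from the project.

# ISO 639-1 codes for the five supported languages.
LANGUAGE_CODES = {
    "tamil": "ta",
    "telugu": "te",
    "hindi": "hi",
    "kannada": "kn",
    "marathi": "mr",
}


def resolve_language(name):
    """Map a language name (case-insensitive) to its ISO 639-1 code."""
    try:
        return LANGUAGE_CODES[name.strip().lower()]
    except KeyError:
        raise ValueError(f"Unsupported language: {name!r}") from None


def extract_text(image_path):
    """Load an image, binarize it, and run Tesseract OCR on the result."""
    # Third-party imports are done lazily so the module loads without them.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding (an assumed preprocessing step) picks the cut-off automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary, lang="eng").strip()


def translate_text(text, language):
    """Translate English text; assumes a synchronous googletrans translate()."""
    from googletrans import Translator

    result = Translator().translate(text, src="en", dest=resolve_language(language))
    return result.text


def speak(text, language, out_path="translation.mp3"):
    """Save spoken audio for the text as an MP3 file and return its path."""
    from gtts import gTTS

    gTTS(text=text, lang=resolve_language(language)).save(out_path)
    return out_path


def read_sign_board(image_path, language, out_path="translation.mp3"):
    """Run the full pipeline and return (english, translated, audio_path)."""
    english = extract_text(image_path)
    translated = translate_text(english, language)
    return english, translated, speak(translated, language, out_path)
```

A call such as `read_sign_board("board.jpg", "Tamil")` (with `board.jpg` standing in for any photo of a sign) would return the recognized English text, its translation, and the path of the generated audio file. Keeping the language lookup in one helper means an unsupported language fails before any OCR or network call is made.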