Tinzyl/text_extraction_from_image

Jupyter Notebook, MIT License

text_extraction_from_image

In this project, I used Tesseract and OpenCV to build a program that extracts text from images. The development steps were as follows:

Load the image, resize it and then save it.
Use the PyTesseract Library to extract the text from the image.
Process the text extracted from the image.
Use Computer Vision to perform further processing of complex images.
Remove noise from the image using the blur function.
Perform threshold transformation of the image.
Use the erode function of cv2 on the image.
Perform other image processing operations.
Draw a rectangle around a character/word/pattern.

Results

Original Image:

Text Extracted Using PyTesseract:

The results were not fully accurate: PyTesseract failed to extract the text "to 54170 to join our text club".

Rectangle Drawn Around Text After Performing Computer Vision operations:

After performing computer vision operations with OpenCV, the results were accurate: a rectangle was drawn around the text.