/opencv-text-recognition

OpenCV text recognition using Tesseract

Primary LanguagePython

OpenCV + Tesseract Text Recognition

This project is extracted from this amazing blog post of Adrian Rosebrock at Pyimagesearch.

In addition I have add a requirements.txt file with Python dependencies and also apply some minimal changes due to improve code cleanness.

Installation

For installation first we need to install Tesseract. To do so execute the following command:

sudo apt install tesseract-ocr

You can easy verify if everything was correct installed checking version as follows:

tesseract -v

tesseract 4.0.0
 leptonica-1.76.0
  libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX2
 Found AVX
 Found SSE

Once tesseract is installed, you can proceed installing Python dependencies. To do so execute the following command:

pip install -r requirements.txt

Run

The following are examples of invocations:

python text_recognition.py --east frozen_east_text_detection.pb 
	--image images/example_01.jpg

Adding 5% of padding to bounding boxes

python text_recognition.py --east frozen_east_text_detection.pb --image images/example_05.jpg --padding 0.05

Commands

References

  1. OpenCV OCR and text recognition with Tesseract https://www.pyimagesearch.com/2018/09/17/opencv-ocr-and-text-recognition-with-tesseract/

  2. Tesseract the complete list of available idioms https://tesseract-ocr.github.io/tessdoc/Data-Files