Parsing text data in images using OpenCV and the EAST algorithm

Text Detection and Recognition

This project uses OpenCV and the EAST text detector to detect text in images. Bounding boxes can be obtained for individual text regions as well as for whole lines. Once the text is detected, the Tesseract C++ API is used to recognize and extract it.
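As an illustration of how region-level boxes could be combined into line-level boxes, here is a minimal sketch. It is not taken from text_recognition.cpp: the `Box` struct, the function names and the 0.5 vertical-overlap threshold are all assumptions made for this example, and it uses axis-aligned boxes where EAST can also produce rotated ones.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical axis-aligned box: top-left corner plus width and height.
struct Box {
    float x, y, w, h;
};

// Smallest box that contains both a and b.
Box unite(const Box& a, const Box& b) {
    float x0 = std::min(a.x, b.x);
    float y0 = std::min(a.y, b.y);
    float x1 = std::max(a.x + a.w, b.x + b.w);
    float y1 = std::max(a.y + a.h, b.y + b.h);
    return {x0, y0, x1 - x0, y1 - y0};
}

// Two boxes count as the same line when their vertical overlap covers at
// least `minOverlap` of the shorter box's height (assumed threshold).
bool sameLine(const Box& a, const Box& b, float minOverlap = 0.5f) {
    assert(minOverlap > 0.0f && minOverlap <= 1.0f);
    float top = std::max(a.y, b.y);
    float bottom = std::min(a.y + a.h, b.y + b.h);
    return (bottom - top) >= minOverlap * std::min(a.h, b.h);
}

// Greedy grouping: visit boxes from top to bottom and merge each one into
// the first existing line it overlaps; otherwise start a new line.
std::vector<Box> mergeIntoLines(std::vector<Box> regions) {
    std::sort(regions.begin(), regions.end(),
              [](const Box& a, const Box& b) { return a.y < b.y; });
    std::vector<Box> lines;
    for (const Box& r : regions) {
        bool merged = false;
        for (Box& line : lines) {
            if (sameLine(line, r)) {
                line = unite(line, r);
                merged = true;
                break;
            }
        }
        if (!merged) lines.push_back(r);
    }
    return lines;
}
```

Calling `mergeIntoLines` on the boxes returned by the detector would give one enclosing box per text line. A greedy pass like this is simple but sensitive to skewed text, so treat it as a starting point only.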

REQUIREMENTS

  1. Ubuntu 16.04/18.04
  2. C++ (g++ compiler)
  3. Tesseract
  4. OpenCV

USAGE

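Compile text_recognition.cpp as shown below. Besides Tesseract and OpenCV, the link flags also pull in Boost.System, OpenSSL and cpprest (the C++ REST SDK), so these must be installed too.

g++ text_recognition.cpp -lboost_system -lcrypto -lssl -lcpprest `pkg-config --cflags --libs tesseract opencv`

Then run the resulting binary, passing the input image (-i), the EAST model file (-m) and the detection level (-d, here line):

./a.out -i=sample.jpg -m=frozen_east_text_detection.pb -d=line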

RESULTS

Text detection output:

Text recognition output:

{ "text": "HEALTHY FOOD MENU!", "boundingBox": [ { "x": "280.38", "y": "62.87" }, { "x": "280.13", "y": "29.76" }, { "x": "686.98", "y": "27.60" }, { "x": "687.45", "y": "62.72" } ] }
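For readers who want to produce records of this shape themselves, the sketch below builds the same kind of JSON string from a recognized text and four corner points. It is an assumption-laden illustration, not the project's own serializer: `Point`, `fmt`, `escapeJson` and `toJson` are hypothetical names, and only quotes, backslashes and newlines are escaped.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical corner point of a bounding box, in pixels.
struct Point {
    double x, y;
};

// Format a coordinate with two decimals, matching the sample output.
std::string fmt(double v) {
    char buf[32];
    int n = std::snprintf(buf, sizeof buf, "%.2f", v);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf);
}

// Minimal JSON string escaping: quotes, backslashes and newlines only.
// Other control characters are passed through unchanged.
std::string escapeJson(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') {
            out += '\\';
            out += c;
        } else if (c == '\n') {
            out += "\\n";
        } else {
            out += c;
        }
    }
    return out;
}

// Build one record: the text plus its four corners, coordinates as strings.
std::string toJson(const std::string& text, const std::array<Point, 4>& box) {
    std::string out = "{ \"text\": \"" + escapeJson(text) + "\", \"boundingBox\": [ ";
    for (std::size_t i = 0; i < box.size(); ++i) {
        if (i > 0) out += ", ";
        out += "{ \"x\": \"" + fmt(box[i].x) + "\", \"y\": \"" + fmt(box[i].y) + "\" }";
    }
    out += " ] }";
    return out;
}
```

Passing the text and the four corner coordinates from the record above to `toJson` yields a one-line record with the same fields. In a real project a JSON library would be a safer choice than hand-built strings.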
