
colored zone-segmented OCR


clean-ocr

Open an image -> draw over the zones of interest (drag and drop) -> get the text.

Sometimes you don't need to scan the whole image,
and cropping it into a separate file first is a bit annoying.
So use this script instead.

Installation

1 - Download the whole project
2 - Open a terminal in the project's folder
3 - Run pip install -r requirements.txt
4 - Important: you also need to install Tesseract OCR, see below
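
If you want to confirm step 4 before running the script, the sketch below checks whether a Tesseract executable is reachable. It assumes the executable is named tesseract and is on your PATH; that is an assumption, not something this project documents, and some installers may not add it to PATH for you. The function names find_tesseract and tesseract_version are made up for this example.

```python
import shutil
import subprocess


def find_tesseract(name="tesseract"):
    """Return the full path of the executable if it is on PATH, else None.

    The default name "tesseract" is an assumption about your install.
    """
    return shutil.which(name)


def tesseract_version(executable):
    """Return the first line printed by `<executable> --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract builds print the version to stderr instead of stdout.
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; install it or add it to PATH.")
    else:
        print("Found:", path)
        print(tesseract_version(path))
```

If nothing is found even though Tesseract is installed, adding its install folder to PATH is the usual fix.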

Usage

5 - Run python clean-ocr.py $path_to_your_image_here
Example -> python clean-ocr.py C:\users\image.jpg
6 - After your image has loaded, drag to draw as many rectangles as you like over the zones of interest
7 - Then just press Enter!

ESCAPE KEY    -> close
ENTER KEY     -> get text from selected areas
BACKSPACE KEY -> start again
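
To give a rough idea of what "get text from selected areas" involves, here is a minimal sketch, not the actual code of clean-ocr.py. It assumes Pillow and pytesseract are installed and that each rectangle is recorded as a press point and a release point; the names drag_to_box and ocr_zones are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def drag_to_box(start: Point, end: Point, width: int, height: int) -> Box:
    """Turn a drag (press point, release point) into a box inside the image.

    The drag may go in any direction, so the corners are sorted first and
    then clamped to the image bounds.
    """
    (x0, y0), (x1, y1) = start, end
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    left = max(0, min(left, width))
    right = max(0, min(right, width))
    top = max(0, min(top, height))
    bottom = max(0, min(bottom, height))
    return (left, top, right, bottom)


def ocr_zones(image_path: str, boxes: List[Box]) -> List[str]:
    """Crop each box out of the image and run Tesseract on the crop only.

    Assumption: Pillow and pytesseract are available. They are imported
    here so that drag_to_box stays usable without them.
    """
    from PIL import Image
    import pytesseract

    texts = []
    with Image.open(image_path) as img:
        for left, top, right, bottom in boxes:
            if right <= left or bottom <= top:
                continue  # skip empty rectangles (a click without a drag)
            zone = img.crop((left, top, right, bottom))
            texts.append(pytesseract.image_to_string(zone).strip())
    return texts
```

Sorting the corners means a rectangle dragged from bottom-right to top-left produces the same box as one dragged the other way.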


Uses Tesseract OCR

Project: https://github.com/tesseract-ocr/tesseract
Windows installer: https://github.com/UB-Mannheim/tesseract/wiki
Other installation methods: https://tesseract-ocr.github.io/tessdoc/Installation.html