terpiljenya/ucu-summer-ocr-1

END-TO-END pipeline for text detection and recognition

Original models:

Text detection - EAST
Text recognition - CRNN

Download

Download pretrained models:

frozen TensorFlow EAST model - here [97M]
pretrained CRNN English model - here [34M]

and put them in the pretrained_models folder

Download datasets:

[optional] Small train - here [529M]
Validation - here [118M]

and put them in the data folder

Run

test image - run_demo_server.py and open http://0.0.0.0:8769/
validation - validation.py

	Baseline (English pretrained model)	Benchmark (25 epochs on 80K SynthText)
Char precision	0.1569	0.3218
Word precision	0.1017	0.1175
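
The exact definitions behind the Char precision and Word precision rows are not spelled out here, so the following is only a minimal sketch under stated assumptions: word precision is taken as the fraction of predictions that match the ground truth exactly, and char precision as the fraction of predicted characters that match the ground-truth character at the same position. The function name `char_and_word_precision` is hypothetical and is not taken from validation.py.

```python
from typing import Sequence, Tuple


def char_and_word_precision(
    predictions: Sequence[str], targets: Sequence[str]
) -> Tuple[float, float]:
    """Return (char_precision, word_precision) under the assumed definitions."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")

    exact_words = 0       # predictions identical to their ground truth
    matched_chars = 0     # characters equal at the same position
    predicted_chars = 0   # total number of predicted characters
    for pred, target in zip(predictions, targets):
        exact_words += int(pred == target)
        matched_chars += sum(p == t for p, t in zip(pred, target))
        predicted_chars += len(pred)

    # Guard against empty input to avoid division by zero.
    word_precision = exact_words / len(predictions) if predictions else 0.0
    char_precision = matched_chars / predicted_chars if predicted_chars else 0.0
    return char_precision, word_precision


char_p, word_p = char_and_word_precision(["hello", "wrld"], ["hello", "world"])
```

Position-wise matching penalizes a single dropped character heavily, because every following character shifts. An edit-distance-based score would be a gentler alternative if that behaviour is not wanted.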

Useful links

CRNN

paper - https://arxiv.org/abs/1507.05717
pytorch (YouScan port) - https://github.com/YouScan/crnn.pytorch
pytorch chinese + generation - https://github.com/Sierkinhane/crnn_chinese_characters_rec
tensorflow - https://github.com/MaybeShewill-CV/CRNN_Tensorflow
keras - https://github.com/Tony607/keras-image-ocr
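
The CRNN paper listed above trains the recognizer with CTC, so the per-frame network output has to be collapsed into a string at inference time. Below is a minimal sketch of greedy CTC decoding. It assumes that index 0 is the blank label and that label `i` maps to `alphabet[i - 1]`; the function name `ctc_greedy_decode` and this label layout are illustrative assumptions, not necessarily what any of the linked implementations use.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    frame_labels: Sequence[int], alphabet: str, blank: int = 0
) -> str:
    """Collapse per-frame argmax labels: merge repeats, then drop blanks."""
    chars: List[str] = []
    previous = blank
    for label in frame_labels:
        # Emit a character only when the label is not blank and differs
        # from the label of the previous frame.
        if label != blank and label != previous:
            chars.append(alphabet[label - 1])  # assumed offset: 0 is blank
        previous = label
    return "".join(chars)


decoded = ctc_greedy_decode(
    [0, 8, 8, 0, 5, 12, 12, 0, 12, 15, 15],
    "abcdefghijklmnopqrstuvwxyz",
)
```

The blank between the two runs of label 12 is what allows a doubled letter to survive the merge step.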

Generate synthetic dataset

SynthText (YouScan port) - https://github.com/YouScan/SynthText
TextRecognitionDataGenerator - https://github.com/Belval/TextRecognitionDataGenerator
pytorch CRNN chinese + generation - https://github.com/Sierkinhane/crnn_chinese_characters_rec

Other OCR Links

Papers/repositories/tools about text detection and recognition: