text-detection-recognize-ctpn-tesseract

Customized from https://github.com/eragonruan/text-detection-ctpn, with Tesseract text recognition added for each detected box.

setup

The nms and bbox utils are written in Cython, so you have to build the library first.

cd utils/bbox
chmod +x make.sh
./make.sh

This generates nms.so and bbox.so in the current folder.
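
If the demo later fails on import, it can help to confirm that the build really produced both libraries. The following is a minimal sketch, not part of the repo: the helper names are hypothetical, and it assumes that some Python 3 builds name the files with an ABI tag (for example nms.cpython-36m-x86_64-linux-gnu.so) instead of plain nms.so.

```python
from pathlib import Path
from typing import Dict, List, Sequence


def find_built_extensions(folder: str, names: Sequence[str] = ("nms", "bbox")) -> Dict[str, List[str]]:
    """Map each extension name to the .so files found for it in `folder`."""
    root = Path(folder)
    found: Dict[str, List[str]] = {name: [] for name in names}
    if not root.is_dir():
        return found
    for path in sorted(root.iterdir()):
        # Accept "nms.so" as well as ABI-tagged names such as "nms.cpython-36m-....so".
        stem = path.name.split(".")[0]
        if path.is_file() and path.suffix == ".so" and stem in found:
            found[stem].append(path.name)
    return found


def missing_extensions(folder: str, names: Sequence[str] = ("nms", "bbox")) -> List[str]:
    """Return the names for which no shared object was found."""
    built = find_built_extensions(folder, names)
    return [name for name in names if not built[name]]


if __name__ == "__main__":
    missing = missing_extensions("utils/bbox")
    print("missing:", missing if missing else "none")
```

Run it from the repository root; an empty list means both libraries are in place.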

To run the demo, download the ckpt file from Google Drive or Baidu Yun.
Put "checkpoints_mlt/" in "text-detection-ctpn/".
Put your images in "data/demo" and run the demo from the root. The output images and text are written to "data/res".

python main/demo.py
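
The per-box recognition step can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in main/demo.py: recognize_boxes and recognize_file are hypothetical names, boxes are assumed to be axis-aligned (left, top, right, bottom) pixel rectangles, and recognize_file assumes Pillow, pytesseract, and the tesseract binary are installed.

```python
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def recognize_boxes(image, boxes: Iterable[Box], ocr: Callable[[object], str]) -> List[Tuple[Box, str]]:
    """Crop each box out of `image` and run `ocr` on the crop.

    `image` only needs a PIL-style crop((left, top, right, bottom)) method.
    """
    results = []
    # Sort top-to-bottom, then left-to-right, so text comes out in reading order.
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        crop = image.crop(box)
        results.append((box, ocr(crop).strip()))
    return results


def recognize_file(image_path: str, boxes: Iterable[Box]) -> List[Tuple[Box, str]]:
    """Convenience wrapper that uses Pillow and pytesseract (imported lazily)."""
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return recognize_boxes(image, boxes, pytesseract.image_to_string)
```

Passing the OCR function in as an argument keeps the cropping logic independent of Tesseract, so a different recognizer can be swapped in without touching the loop.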

To train the model, first download the pre-trained VGG net model and put it at data/vgg_16.ckpt. You can download it from tensorflow/models.
Second, download the dataset we prepared from Google Drive or Baidu Yun. Put the downloaded data in data/dataset/mlt, then start the training.
Alternatively, you can prepare your own dataset with the following steps.
Modify DATA_FOLDER and OUTPUT in utils/prepare/split_label.py according to your dataset, then run split_label.py from the root:

python ./utils/prepare/split_label.py

It will generate the prepared data in data/dataset/.
A sample input file for split_label.py is gt_img_859.txt, and the corresponding output file is img_859.txt. A demo image of the prepared data is shown below.
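
To give a feel for the kind of labels CTPN trains on, here is a simplified sketch of cutting one text box into fixed-width strips. It assumes the CTPN convention of 16-pixel-wide proposals and an axis-aligned box with inclusive pixel coordinates; split_into_strips is a hypothetical helper, and the actual split_label.py may handle resizing and arbitrary quadrilaterals differently.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive pixels


def split_into_strips(box: Box, stride: int = 16) -> List[Box]:
    """Cut an axis-aligned text box into grid-aligned strips `stride` pixels wide."""
    x_min, y_min, x_max, y_max = box
    # Snap the first strip to the stride grid so strips line up across boxes.
    start = (x_min // stride) * stride
    strips = []
    for left in range(start, x_max + 1, stride):
        strips.append((left, y_min, left + stride - 1, y_max))
    return strips


if __name__ == "__main__":
    for strip in split_into_strips((20, 5, 50, 30)):
        print(",".join(str(v) for v in strip))
```

The first and last strips may extend slightly past the original box because every strip is aligned to the 16-pixel grid.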

To start training, simply run

python main/train.py

The model provided in checkpoints_mlt was trained on a GTX 1070 for 50k iterations. Each iteration takes about 0.25 s, so 50k iterations take about 3.5 hours.
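
The estimate is just iterations times seconds per iteration. A small sketch for redoing the arithmetic with your own numbers (training_hours is a hypothetical helper, and the per-iteration time on other hardware has to be measured):

```python
def training_hours(iterations: int, seconds_per_iteration: float) -> float:
    """Estimated wall-clock training time in hours."""
    return iterations * seconds_per_iteration / 3600.0


if __name__ == "__main__":
    # 50,000 iterations at 0.25 s each is 12,500 s.
    print(f"{training_hours(50_000, 0.25):.2f} hours")  # prints "3.47 hours"
```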

NOTICE: all the photos used below were collected from the internet. If this affects you, please contact me and I will delete them.