deep-text-recognition-benchmark: A Jupyter Notebook repository from PhiDCH

Getting Started

Dependency

This work was tested with PyTorch 1.3.1, CUDA 10.1, Python 3.6 and Ubuntu 16.04.
You may need to run pip3 install torch==1.3.1.
In the paper, experiments were performed with PyTorch 0.4.1 and CUDA 9.0.
Requirements: lmdb, pillow, torchvision, nltk, natsort

pip3 install lmdb pillow torchvision nltk natsort
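Before training, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: the function name missing_packages and the REQUIRED mapping are made up for illustration. It only checks that each module can be found and does not verify versions.

```python
import importlib.util

# Hypothetical helper, not part of the repository.
# Maps pip package names (from the list above) to the module names
# they are imported as. Note that "pillow" is imported as "PIL".
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "lmdb": "lmdb",
    "pillow": "PIL",
    "nltk": "nltk",
    "natsort": "natsort",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip3 install " + " ".join(missing))
    else:
        print("All required packages were found.")
```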

When you need to train on your own dataset or Non-Latin language datasets.

Create your own lmdb dataset.

pip3 install fire
python3 create_lmdb_dataset.py --inputPath data/ --gtFile data/gt.txt --outputPath result/

The data folder is structured as below.

data
├── gt.txt
└── test
    ├── word_1.png
    ├── word_2.png
    ├── word_3.png
    └── ...

In this case, each line of gt.txt should have the form {imagepath}\t{label}\n, that is, the image path and the label separated by a tab.
For example:

test/word_1.png Tiredness
test/word_2.png kills
test/word_3.png A
...
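Because the separator must be a tab, writing gt.txt by hand is error-prone. The sketch below writes and reads the file in the {imagepath}\t{label}\n form described above. The helper names write_gt_file and read_gt_file are hypothetical and not part of the repository. Rejecting fields that contain a tab or newline is a precaution added here, not a documented requirement.

```python
import os
import tempfile


def write_gt_file(samples, gt_path):
    """Write (image_path, label) pairs as tab-separated lines."""
    with open(gt_path, "w", encoding="utf-8") as f:
        for image_path, label in samples:
            # A tab or newline inside either field would corrupt the format.
            for field in (image_path, label):
                if "\t" in field or "\n" in field:
                    raise ValueError(f"tab or newline in field: {field!r}")
            f.write(f"{image_path}\t{label}\n")


def read_gt_file(gt_path):
    """Read tab-separated lines back into (image_path, label) pairs."""
    samples = []
    with open(gt_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                # Split on the first tab only: path first, label second.
                image_path, label = line.split("\t", 1)
                samples.append((image_path, label))
    return samples


if __name__ == "__main__":
    samples = [
        ("test/word_1.png", "Tiredness"),
        ("test/word_2.png", "kills"),
        ("test/word_3.png", "A"),
    ]
    with tempfile.TemporaryDirectory() as tmp_dir:
        gt_path = os.path.join(tmp_dir, "gt.txt")
        write_gt_file(samples, gt_path)
        print(read_gt_file(gt_path))
```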

Modify --select_data, --batch_ratio, and opt.character, see this issue.
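For a custom or non-Latin dataset, the character set has to cover every character that appears in the labels. One way to derive it is to collect the distinct characters from the labels in gt.txt. This is a sketch under that assumption: the function name build_character_set is hypothetical, and how the resulting string is passed to opt.character depends on the training script.

```python
def build_character_set(labels):
    """Return a sorted string of every distinct character in the labels."""
    return "".join(sorted(set("".join(labels))))


if __name__ == "__main__":
    # Labels taken from the gt.txt example above.
    labels = ["Tiredness", "kills", "A"]
    print(build_character_set(labels))
```

Sorting keeps the character order stable between runs, which matters if a saved model is later reloaded with a character set built the same way.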

Training and evaluation

Download the pretrained model

gdown --id 1GXFr31EFqnFPjgITJNklqLO3Dholo6VW -O pretrain.pth

Train CRNN[10] model

python -W ignore train.py \
--train_data data/train_data --valid_data data/valid_data \
--Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
--num_iter 30000 \
--batch_size 150 \
--imgW 100 \
--imgH 32 \
--workers 0 \
--batch_max_length 80 \
--valInterval 500 \
--exp_name CRNN_batch1_sens \
--saved_model pretrain.pth
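The four stage flags in the command above each select one module. The sketch below checks a set of choices against the options listed under Arguments and builds the matching command-line fragment. The names STAGE_CHOICES and stage_args are hypothetical helpers for illustration. They are not part of train.py.

```python
# Allowed values per stage, copied from the Arguments section.
STAGE_CHOICES = {
    "Transformation": {"None", "TPS"},
    "FeatureExtraction": {"VGG", "RCNN", "ResNet"},
    "SequenceModeling": {"None", "BiLSTM"},
    "Prediction": {"CTC", "Attn"},
}


def stage_args(**stages):
    """Validate one choice per stage and return the flags as an argv list."""
    args = []
    for stage, allowed in STAGE_CHOICES.items():
        if stage not in stages:
            raise ValueError(f"missing choice for --{stage}")
        choice = stages[stage]
        if choice not in allowed:
            raise ValueError(
                f"--{stage} must be one of {sorted(allowed)}, got {choice!r}"
            )
        args += [f"--{stage}", choice]
    return args


if __name__ == "__main__":
    # The combination used in the training command above.
    print(" ".join(stage_args(
        Transformation="TPS",
        FeatureExtraction="ResNet",
        SequenceModeling="BiLSTM",
        Prediction="Attn",
    )))
```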

Arguments

--train_data: folder path to training lmdb dataset.
--valid_data: folder path to validation lmdb dataset.
--eval_data: folder path to evaluation (with test.py) lmdb dataset.
--select_data: select training data. Default is MJ-ST, which means MJ and ST are used as training data.
--batch_ratio: assign the ratio of each selected dataset in the batch. Default is 0.5-0.5, which means 50% of the batch is filled with MJ and the other 50% with ST.
--data_filtering_off: skip data filtering when creating LmdbDataset.
--Transformation: select Transformation module [None | TPS].
--FeatureExtraction: select FeatureExtraction module [VGG | RCNN | ResNet].
--SequenceModeling: select SequenceModeling module [None | BiLSTM].
--Prediction: select Prediction module [CTC | Attn].
--saved_model: path to a saved model to evaluate. In the training command above, it is the checkpoint that training starts from.
--benchmark_all_eval: evaluate with 10 evaluation dataset versions, the same as Table 1 in the paper.
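To make the interaction between --select_data, --batch_ratio and --batch_size concrete, the sketch below splits a batch across the selected datasets. The function name split_batch is hypothetical. Rounding each share and using at least one sample per dataset is an assumption made for this sketch, so check the repository's dataset code for the exact rule.

```python
def split_batch(batch_size, select_data, batch_ratio):
    """Return the number of samples each selected dataset adds to a batch."""
    names = select_data.split("-")
    ratios = [float(r) for r in batch_ratio.split("-")]
    if len(names) != len(ratios):
        raise ValueError("select_data and batch_ratio must have equal length")
    # Assumption: round each share and use at least one sample per dataset.
    return {
        name: max(round(batch_size * ratio), 1)
        for name, ratio in zip(names, ratios)
    }


if __name__ == "__main__":
    # Defaults from the argument list, with the batch size used above.
    print(split_batch(150, "MJ-ST", "0.5-0.5"))
```

With a batch size of 150 and the default 0.5-0.5 ratio, each of MJ and ST contributes 75 samples per batch.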

See Colab.
