This is an application for scene text detection (TextBoxes++) and recognition (CRNN).
TextBoxes++ is a unified framework for oriented scene text detection with a single network. It extends TextBoxes. CRNN is an open-source text recognizer. The TextBoxes++ code is based on SSD and TextBoxes; the CRNN code is modified from the original CRNN implementation.
For more details, please refer to our arXiv paper.
Please cite the related works in your publications if they help your research:
@article{Liao2018Text,
title = {{TextBoxes++}: A Single-Shot Oriented Scene Text Detector},
author = {Minghui Liao and Baoguang Shi and Xiang Bai},
journal = {{IEEE} Transactions on Image Processing},
doi = {10.1109/TIP.2018.2825107},
url = {https://doi.org/10.1109/TIP.2018.2825107},
volume = {27},
number = {8},
pages = {3676--3690},
year = {2018}
}
@inproceedings{LiaoSBWL17,
author = {Minghui Liao and
Baoguang Shi and
Xiang Bai and
Xinggang Wang and
Wenyu Liu},
title = {TextBoxes: {A} Fast Text Detector with a Single Deep Neural Network},
booktitle = {AAAI},
year = {2017}
}
@article{ShiBY17,
author = {Baoguang Shi and
Xiang Bai and
Cong Yao},
title = {An End-to-End Trainable Neural Network for Image-Based Sequence Recognition
and Its Application to Scene Text Recognition},
journal = {{IEEE} TPAMI},
volume = {39},
number = {11},
pages = {2298--2304},
year = {2017}
}
NOTE: There is partial support for a Docker image. See docker/README.md. (Thanks to @mdbenito for the PR.)
Torch7 (for CRNN);
g++-5; CUDA 8.0; cuDNN v5.1 (cuDNN 6 and cuDNN 7 may fail); OpenCV 3.0;
Please refer to Caffe Installation for the remaining dependencies.
- Compile TextBoxes++ (this is a modified version of Caffe, so you do not need to install the official Caffe)
# Modify Makefile.config according to your Caffe installation.
cp Makefile.config.example Makefile.config
make -j8
# Make sure $CAFFE_ROOT/python is on your PYTHONPATH.
make py
- Compile CRNN (please refer to CRNN if you have trouble with the compilation)
cd crnn/src/
sh build_cpp.sh
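If you prefer not to change PYTHONPATH globally, the Python bindings can also be made importable from inside a script. The following is a minimal sketch, not part of this repository: `add_caffe_to_path` is a hypothetical helper, and it assumes the bindings built by `make py` live in `$CAFFE_ROOT/python`.

```python
import os
import sys


def add_caffe_to_path(caffe_root):
    """Prepend <caffe_root>/python to sys.path if it is not already there.

    Hypothetical helper; assumes the pycaffe bindings were built with
    `make py` and therefore live under <caffe_root>/python.
    """
    caffe_python = os.path.join(os.path.abspath(caffe_root), "python")
    if caffe_python not in sys.path:
        sys.path.insert(0, caffe_python)
    return caffe_python
```

Typical use would be `add_caffe_to_path(os.environ.get("CAFFE_ROOT", "."))` followed by `import caffe`. If the import still fails, the build itself is the more likely culprit.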
- Pre-trained model on SynthText (used for training): Dropbox; BaiduYun
- Model trained on ICDAR 2015 Incidental Text (used for testing): Dropbox; BaiduYun
Please place the above models in "./models/"
If your data differs substantially from ICDAR 2015 Incidental Text, you should train on your own data, starting from the model pre-trained on SynthText.
- Please place the CRNN model in "./crnn/model/"
Download the ICDAR 2015 model and place it in "./models/"
python examples/text/demo.py
The detection results and recognition results are in "./demo_images"
- Convert the ground truth into "xml" form: example.xml
- Create train/test lists (train.txt / test.txt) in "./data/text/" with the following form:
path_to_example1.jpg path_to_example1.xml
path_to_example2.jpg path_to_example2.xml
- Run "./data/text/creat_data.sh"
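The list files can be written by hand, but for larger datasets a small script is less error-prone. The sketch below is one possible way to do it and is not part of this repository: `build_list_lines` and `write_list` are hypothetical helpers, and the code assumes one image/annotation pair per line, with each XML file sharing its image's base name.

```python
import os


def build_list_lines(image_dir, xml_dir, image_ext=".jpg"):
    """Return 'image_path xml_path' lines for images with a matching XML file.

    Assumes (hypothetically) that foo.jpg is annotated by foo.xml.
    Images without an annotation file are skipped.
    """
    lines = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != image_ext:
            continue
        xml_path = os.path.join(xml_dir, stem + ".xml")
        if os.path.isfile(xml_path):
            image_path = os.path.join(image_dir, name)
            lines.append("{} {}".format(image_path, xml_path))
    return lines


def write_list(lines, list_path):
    """Write one pair per line to list_path (e.g. ./data/text/train.txt)."""
    with open(list_path, "w") as f:
        for line in lines:
            f.write(line + "\n")
```

For example, `write_list(build_list_lines("my_images", "my_xmls"), "./data/text/train.txt")`, where both directory names are placeholders. Paths containing spaces would break the space-separated format, so they are best avoided.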
1. Modify the LMDB path in modelConfig.py
2. Run "python examples/text/train.py"
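A wrong LMDB path is an easy mistake to make before training, so a quick check can save a failed run. The sketch below is not part of this repository: `looks_like_lmdb` is a hypothetical helper, and it assumes the database was created in LMDB's default directory layout, which contains a `data.mdb` file.

```python
import os


def looks_like_lmdb(path):
    """Return True if path is a directory that contains an LMDB data file.

    Hypothetical pre-flight check; it only tests for data.mdb and does not
    open the database or validate its contents.
    """
    return os.path.isdir(path) and os.path.isfile(os.path.join(path, "data.mdb"))
```

Call it with the same path you set in modelConfig.py; a `False` result usually means the data creation step did not finish or the path points to the wrong directory.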