https://arxiv.org/abs/1703.01086
We have re-implemented RRPN in PyTorch 1.0! See https://github.com/mjq11302010044/RRPN_pytorch for more details.
RRPN is released under the MIT License (refer to the LICENSE file for details). This project is for research purposes only; for any further use of RRPN, please contact the authors.
If you find RRPN useful in your research, please consider citing:
@article{Jianqi17RRPN,
Author = {Jianqi Ma and Weiyuan Shao and Hao Ye and Li Wang and Hong Wang and Yingbin Zheng and Xiangyang Xue},
Title = {Arbitrary-Oriented Scene Text Detection via Rotation Proposals},
journal = {IEEE Transactions on Multimedia},
volume={20},
number={11},
pages={3111-3122},
year={2018}
}
- Requirements: software
- Requirements: hardware
- Basic installation
- Demo
- Beyond the demo: training and testing
1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)
Note: Caffe must be built with support for Python layers!
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1
You can download my Makefile.config for reference.
2. Python packages you might not have: cython, python-opencv, easydict
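The two software requirements above can be checked before building. The following is a minimal sketch and is not part of the RRPN repository: the helper names enabled_flags and missing_modules are hypothetical, and the import names Cython and cv2 are assumed to be the module names under which the cython and python-opencv packages are importable on your system.

```python
import importlib.util
import re

# Flags the README asks for in Makefile.config.
WANTED_FLAGS = ("WITH_PYTHON_LAYER", "USE_CUDNN")

# Assumed import names: cython -> Cython, python-opencv -> cv2, easydict -> easydict.
WANTED_MODULES = ("Cython", "cv2", "easydict")

# Matches an uncommented "NAME := 1" line, optionally followed by a comment.
_FLAG_RE = re.compile(r"^\s*([A-Z_0-9]+)\s*:=\s*1\s*(#.*)?$")


def enabled_flags(makefile_text):
    """Return the names of all 'NAME := 1' assignments that are not commented out."""
    flags = set()
    for line in makefile_text.splitlines():
        match = _FLAG_RE.match(line)
        if match:
            flags.add(match.group(1))
    return flags


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

To use it, read your Makefile.config into a string, pass it to enabled_flags, and compare the result against WANTED_FLAGS; missing_modules(WANTED_MODULES) lists the Python packages that still need to be installed.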
- For training the end-to-end version of RRPN with VGG16, 4 to 5 GB of GPU memory is sufficient (using CUDNN)
- Clone the RRPN repository
# git clone https://github.com/mjq11302010044/RRPN.git
- We'll call the directory that you cloned RRPN into RRPN_ROOT
- Build the Cython modules
cd $RRPN_ROOT/lib
make
- Build Caffe and pycaffe
cd $RRPN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j4 && make pycaffe
- Download pre-computed RRPN detectors
Trained VGG16 model download link: https://drive.google.com/open?id=0B5rKZkZodGIsV2RJUjVlMjNOZkE
Then move the model into $RRPN_ROOT/data/faster_rcnn_models.
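Before running the demo, it can help to confirm that the downloaded model ended up in the expected directory. The following is a small sketch under one stated assumption: the README does not give the model's file name, so the hypothetical helper find_models simply lists files with Caffe's usual .caffemodel extension.

```python
from pathlib import Path


def find_models(rrpn_root):
    """List weight files under <rrpn_root>/data/faster_rcnn_models.

    ASSUMPTION: the downloaded model uses the .caffemodel extension;
    the README does not state the actual file name.
    """
    model_dir = Path(rrpn_root) / "data" / "faster_rcnn_models"
    if not model_dir.is_dir():
        # The directory has not been created yet, so nothing was moved there.
        return []
    return sorted(path.name for path in model_dir.glob("*.caffemodel"))
```

Calling find_models with the path you use as RRPN_ROOT returns an empty list if the model has not been moved into place yet.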
After successfully completing basic installation, you'll be ready to run the demo.
To run the demo
cd $RRPN_ROOT
python ./tools/rotation_demo.py
The text (.txt) results will be saved in $RRPN_ROOT/result.
You can use the function get_rroidb() in $RRPN_ROOT/lib/rotation/data_extractor.py to manage your training data:
Each training sample should be stored in a Python dict like:
im_info = {
    'gt_classes':   # set to 1 (text only)
    'max_classes':  # set to 1 (text only)
    'image':        # path of the image file
    'boxes':        # ground-truth boxes
    'flipped':      # whether the image is flipped (not implemented)
    'gt_overlaps':  # overlap with the class (text)
    'seg_areas':    # area of each ground-truth region
    'height':       # height of the image
    'width':        # width of the image
    'max_overlaps': # max overlap with each gt-proposal
    'rotated':      # random angle by which the image is rotated
}
Then assign your database to the variable 'roidb' in the main function of $RRPN_ROOT/tools/train_net.py:
116: roidb = get_rroidb("train") # change to your data manage function
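As an illustration of the sample format described above, the sketch below builds one such dict and checks that it carries every key from the template. It is not the repository's get_rroidb(): the helpers make_sample and missing_keys are hypothetical, plain lists stand in for whatever array types the training code expects, and both the box layout (x_center, y_center, width, height, angle) and the two-column gt_overlaps layout are assumptions.

```python
# Keys listed in the README's im_info template.
REQUIRED_KEYS = (
    "gt_classes", "max_classes", "image", "boxes", "flipped",
    "gt_overlaps", "seg_areas", "height", "width", "max_overlaps", "rotated",
)


def make_sample(image_path, width, height, boxes, rotated=0):
    """Build one im_info-style dict for an image with text boxes only.

    ASSUMPTION: each box is (x_center, y_center, box_width, box_height, angle).
    """
    n = len(boxes)
    return {
        "gt_classes": [1] * n,             # 1 = text, the only class
        "max_classes": [1] * n,
        "image": image_path,
        "boxes": [tuple(box) for box in boxes],
        "flipped": False,                  # flipping is not implemented
        # ASSUMPTION: one row per box with columns (background, text).
        "gt_overlaps": [[0.0, 1.0] for _ in range(n)],
        "seg_areas": [box[2] * box[3] for box in boxes],
        "height": height,
        "width": width,
        "max_overlaps": [1.0] * n,
        "rotated": rotated,                # 0 is assumed to mean "not rotated"
    }


def missing_keys(sample):
    """Return the template keys that a sample dict does not contain."""
    return sorted(key for key in REQUIRED_KEYS if key not in sample)
```

A custom data function would return a list of such dicts, one per image; running missing_keys over each entry is a cheap sanity check before training starts.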
Pre-trained ImageNet models can be downloaded for the networks described in the paper: VGG16.
cd $RRPN_ROOT
./data/scripts/fetch_imagenet_models.sh
VGG16 comes from the Caffe Model Zoo, but is provided here for your convenience. ZF was trained at MSRA.
Then you can train RRPN by typing:
./experiment/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] rrpn
[NET] is usually VGG16.
Trained RRPN networks are saved under './' by default.
You can change this directory through the variable output_dir in $RRPN_ROOT/tools/train_net.py.
If you have any questions about this project, please send a message to Jianqi Ma (mjq11302010044@gmail.com). Enjoy!