This tutorial is an adapted version of zeyuanxy's tutorial for the py-faster-rcnn code.
We will illustrate how to train Py-Faster-RCNN on another dataset in the following steps, and we will take the gaze database from UCSD RadLabs as the example dataset.
This tutorial assumes that you have cloned and tested the regular py-faster-rcnn repository from rbgirshick.
$ git clone https://github.com/rbgirshick/py-faster-rcnn
We will refer to the root directory as $PY_FASTER_RCNN.
You will also need to follow the installation steps from the original py-faster-rcnn README.
Options to label your images include:
- Matlab, or
- labelImg (https://github.com/tzutalin/labelImg)
Whichever tool you choose, we will use the following common directory layout for every dataset in $PY_FASTER_RCNN/data:
gaze_devkit/
|-- data/
    |-- Annotations/
        |-- *.txt (Annotation files)
    |-- Images/
        |-- *.jpg or *.png (Image files)
    |-- ImageSets/
        |-- train.txt
        |-- val.txt
A simple way to achieve this is to use symbolic links (this is only an example for training; some refactoring will be needed in order to use the test set properly):
$ cd $PY_FASTER_RCNN/data
$ mkdir gaze_devkit/
$ mkdir gaze_devkit/data/
$ ln -s <path/of/gaze/database>/Annotations/ gaze_devkit/data/Annotations
$ ln -s <path/of/gaze/database>/Images/ gaze_devkit/data/Images
Now we need to write train.txt, which lists the names (without extensions) of all image files that will be used for training.
This can be done as follows:
$ cd $PY_FASTER_RCNN/data/gaze_devkit/data/
$ mkdir ImageSets
$ ls Annotations/ | sed 's/\.txt$//' > ImageSets/train.txt
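The layout above also lists a val.txt, which the shell command does not create. The following is a minimal Python sketch that writes both files from the annotation names. The function name `write_image_sets`, the 20% validation fraction and the fixed seed are assumptions made for illustration and are not part of py-faster-rcnn.

```python
import os
import random


def write_image_sets(devkit_data_dir, val_fraction=0.2, seed=0):
    """Write ImageSets/train.txt and ImageSets/val.txt from annotation names.

    Hypothetical helper: the split ratio and seed are illustrative choices.
    """
    ann_dir = os.path.join(devkit_data_dir, "Annotations")
    # Image names are the annotation file names without the .txt extension.
    names = sorted(
        os.path.splitext(f)[0] for f in os.listdir(ann_dir) if f.endswith(".txt")
    )
    # Shuffle reproducibly, then cut off the validation part.
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_fraction)
    splits = {"val": sorted(names[:n_val]), "train": sorted(names[n_val:])}

    sets_dir = os.path.join(devkit_data_dir, "ImageSets")
    if not os.path.isdir(sets_dir):
        os.makedirs(sets_dir)
    for split, split_names in splits.items():
        with open(os.path.join(sets_dir, split + ".txt"), "w") as f:
            # One image name per line, as in the shell version.
            f.write("".join(name + "\n" for name in split_names))
    return splits
```

It would be called on the data directory of the devkit, for example `write_image_sets("gaze_devkit/data")` from $PY_FASTER_RCNN/data.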
You need to add a new Python file describing the dataset to the directory $PY_FASTER_RCNN/lib/datasets (see inria.py for an example). Then the following steps should be taken.
- Modify self._classes in the constructor to fit your dataset.
- Be careful with the extensions of your image files. See image_path_from_index in gaze.py.
- Write the function for parsing annotations. See _load_gaze_annotation in gaze.py.
- Do not forget to add the necessary import statements in your own Python file and in the other Python files in the same directory.
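Since the Images folder may contain both .jpg and .png files, the path lookup has to try more than one extension. The sketch below is a standalone illustration of that idea, not the method from gaze.py: the signature, the `extensions` parameter and the error message are assumptions.

```python
import os


def image_path_from_index(data_path, index, extensions=(".jpg", ".png")):
    """Return the path of the image named `index`, trying each extension in turn.

    Standalone sketch; in the dataset class this would be a method that
    reads the data path from the instance.
    """
    for ext in extensions:
        image_path = os.path.join(data_path, "Images", index + ext)
        if os.path.exists(image_path):
            return image_path
    raise IOError("No image found for index {!r} in {}".format(index, data_path))
```

Raising an error for a missing image makes a wrong extension or a stale entry in train.txt visible immediately instead of during training.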
Then you should modify factory.py in the same directory. For example, to add the gaze database, we add:
from datasets.gaze import gaze
gaze_devkit_path = '$PY_FASTER_RCNN/data/gaze_devkit'
for split in ['train', 'val']:
    name = '{}_{}'.format('gaze', split)
    __sets[name] = (lambda split=split: gaze(split, gaze_devkit_path))
NB: $PY_FASTER_RCNN must be replaced by its actual value, because Python does not expand shell variables inside string literals!
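The `split=split` default argument in the lambda is easy to overlook but matters. The sketch below illustrates why, using a stand-in class instead of the real gaze dataset class. `FakeDataset`, `build_registries` and the placeholder path are hypothetical names introduced only for this demonstration.

```python
class FakeDataset(object):
    """Stand-in for the gaze imdb class; it only records its arguments."""

    def __init__(self, split, devkit_path):
        self.split = split
        self.devkit_path = devkit_path


def build_registries(devkit_path="/path/to/gaze_devkit"):
    """Build two registries, one without and one with the default argument."""
    late, bound = {}, {}
    for split in ["train", "val"]:
        # No default argument: `split` is looked up when the lambda is
        # called, i.e. after the loop has finished, so it is always "val".
        late["gaze_" + split] = lambda: FakeDataset(split, devkit_path)
        # Default argument: the current value of `split` is stored in the
        # lambda at definition time.
        bound["gaze_" + split] = lambda split=split: FakeDataset(split, devkit_path)
    return late, bound


late, bound = build_registries()
print(late["gaze_train"]().split)   # val
print(bound["gaze_train"]().split)  # train
```

Without the default argument, requesting gaze_train would silently load the validation split.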
In this example, we will use the VGG_CNN_M_1024 model with alternating optimization (alt opt). First, copy the solvers and network definitions from $PY_FASTER_RCNN/models/pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/ so that you can adapt them.
NB: If you want to use the end2end method, refer to https://huangying-zhan.github.io/2016/09/22/detection-faster-rcnn.html.
$ cd $PY_FASTER_RCNN/models/
$ mkdir gaze_model/
$ cp -r pascal_voc/VGG_CNN_M_1024/faster_rcnn_alt_opt/ gaze_model/
The changes mainly concern the number of classes you want to train. Let's assume that the number of classes is C (do not forget to count the background class). Then you should:
- Modify num_classes in the 'RoIDataLayer' layer to C
- Modify num_output in the cls_score layer to C
- Modify num_output in the bbox_pred layer to 4 * C
In our case we have 12 classes (including background):
$ grep 12 models/gaze_model/faster_rcnn_alt_opt/*.pt
faster_rcnn_test.pt: num_output: 12
stage1_fast_rcnn_train.pt: param_str: "'num_classes': 12"
stage1_fast_rcnn_train.pt: num_output: 12
stage1_rpn_train.pt: param_str: "'num_classes': 12"
stage2_fast_rcnn_train.pt: param_str: "'num_classes': 12"
stage2_fast_rcnn_train.pt: num_output: 12
stage2_rpn_train.pt: param_str: "'num_classes': 12"
$ grep 48 models/gaze_model/faster_rcnn_alt_opt/*.pt
faster_rcnn_test.pt: num_output: 48
stage1_fast_rcnn_train.pt: num_output: 48
stage2_fast_rcnn_train.pt: num_output: 48
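The values 12 and 48 in the grep output follow directly from the rules above. As a small sketch of the arithmetic, the helper below derives them from a list of object classes. The function name and the placeholder class names are hypothetical.

```python
def expected_layer_sizes(object_classes):
    """Return the values to set in the .pt files for the given object classes."""
    num_classes = len(object_classes) + 1  # +1 for the background class
    return {
        "num_classes": num_classes,    # param_str of the RoIDataLayer layer
        "cls_score": num_classes,      # num_output: one score per class
        "bbox_pred": 4 * num_classes,  # num_output: 4 box coordinates per class
    }


# 11 placeholder object classes plus background give C = 12.
sizes = expected_layer_sizes(["class_%d" % i for i in range(11)])
print(sizes["cls_score"], sizes["bbox_pred"])  # 12 48
```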
The $PY_FASTER_RCNN/models folder must be specified in a config file, as in faster_rcnn_alt_opt.yml:
$ echo 'MODELS_DIR: "$PY_FASTER_RCNN/models"' >> config.yml
NB: $PY_FASTER_RCNN must be replaced by its actual value, because the single quotes prevent the shell from expanding it!
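One way to avoid typing the path by hand is to let a short script resolve it. This is a minimal sketch, assuming the repository root is passed in explicitly. The helper name `write_models_dir_config` is hypothetical, and only the MODELS_DIR key comes from the command above.

```python
import os


def write_models_dir_config(root_dir, cfg_path="config.yml"):
    """Append a MODELS_DIR entry holding the absolute path of <root_dir>/models."""
    models_dir = os.path.join(os.path.abspath(os.path.expanduser(root_dir)), "models")
    line = 'MODELS_DIR: "{}"\n'.format(models_dir)
    # Mode "a" appends to the file, like >> in the shell command.
    with open(cfg_path, "a") as f:
        f.write(line)
    return line
```

If PY_FASTER_RCNN is exported as an environment variable, the call could look like `write_models_dir_config(os.environ["PY_FASTER_RCNN"])`.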
In the directory $PY_FASTER_RCNN, run the following command in the shell.
$ ./tools/train_faster_rcnn_alt_opt.py --gpu 0 --net_name gaze_model --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel --imdb gaze_train
Where:
- --net_name is the folder name in $PY_FASTER_RCNN/models (NB: the train_faster_rcnn_alt_opt.py script automatically looks in the faster_rcnn_alt_opt/ subfolder for the .pt files)
- --weights is the optional location of the pretrained weights (a .caffemodel file)
- --imdb is the full name of the database as specified in lib/datasets/factory.py (NB: don't forget to add the split suffix, e.g. _train or _val!)
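When the training is launched from a script, the same options can be assembled programmatically. The sketch below only builds the argument list and does not run anything. The helper name and the suffix check are assumptions, while the flags are the ones described above.

```python
def build_train_command(net_name, imdb, weights=None, gpu=0):
    """Assemble the argument list for tools/train_faster_rcnn_alt_opt.py."""
    # The imdb name must carry a split suffix registered in factory.py.
    if not imdb.endswith(("_train", "_val")):
        raise ValueError("imdb name needs a split suffix, e.g. gaze_train")
    cmd = [
        "./tools/train_faster_rcnn_alt_opt.py",
        "--gpu", str(gpu),
        "--net_name", net_name,
        "--imdb", imdb,
    ]
    if weights is not None:  # --weights is optional
        cmd += ["--weights", weights]
    return cmd
```

The resulting list could then be passed to `subprocess.call` from the $PY_FASTER_RCNN directory.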
If you are connected to the server over SSH, use screen to keep the process running if the connection drops unexpectedly.
Run this line to test your model:
$ cd $PY_FASTER_RCNN/
$ ./tools/test_net.py --gpu 0 --def models/gaze_model_end2end/test.prototxt --net output/faster_rcnn_end2end/train/vgg_cnn_m_1024_gaze_end2end_iter_20000.caffemodel --imdb gaze_val --cfg experiments/cfgs/faster_rcnn_end2end.yml
In the test.prototxt and train.prototxt files, rename "cls_score" and "bbox_pred" to different names. For example,
name: "cls_score" -> name: "cls_score_object"
name: "bbox_pred" -> name: "bbox_pred_object"
If you are using find and replace, include the quotation marks, so that layers such as "rpn_cls_score" are not renamed as well!
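The find-and-replace step can also be scripted. The following is a minimal sketch that works on the prototxt text as a plain string. The function name and the `_object` default suffix (taken from the example above) are illustrative choices.

```python
def rename_output_layers(prototxt_text, suffix="_object"):
    """Rename "cls_score" and "bbox_pred" (quotes included) in prototxt text."""
    for layer in ("cls_score", "bbox_pred"):
        # Matching the surrounding quotes leaves names such as
        # "rpn_cls_score" untouched.
        prototxt_text = prototxt_text.replace(
            '"{}"'.format(layer), '"{}{}"'.format(layer, suffix)
        )
    return prototxt_text


sample = 'name: "rpn_cls_score"\nname: "cls_score"\ntop: "bbox_pred"\n'
renamed = rename_output_layers(sample)
print(renamed)
```

Because the replacement is applied to every quoted occurrence, top and bottom references to the renamed layers stay consistent with the layer names.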
The purpose of the first fine-tuning run is to obtain a caffemodel whose final fully-connected layers (the renamed cls_score and bbox_pred layers) match the new dataset.
$ ./tools/train_net.py --gpu 0 --weights data/faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel --imdb gaze_train --cfg experiments/cfgs/faster_rcnn_end2end.yml --solver models/gaze_model/solver.prototxt --iter 0
Rename the layers back to their original names from step 1.
Before training on your new dataset, check $PY_FASTER_RCNN/data/cache and remove cached files if necessary. The cache stores information about previously used datasets and may cause problems during training.
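Clearing the cache can be scripted as well. The sketch below lists the cache files first and deletes them only on request. The helper name and the `*.pkl` pattern are assumptions (the cache is assumed to consist of pickle files), so check the contents of the folder before deleting anything.

```python
import glob
import os


def clear_dataset_cache(cache_dir, pattern="*.pkl", dry_run=True):
    """List (and, if dry_run is False, delete) cache files in cache_dir."""
    paths = sorted(glob.glob(os.path.join(cache_dir, pattern)))
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

A first call with the default `dry_run=True` shows what would be removed; a second call with `dry_run=False` performs the deletion.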
This fine-tuning run trains the model for our final use. The pre-trained model for this stage is the model we saved in stage 2 (the run with --iter 0).
$ ./tools/train_net.py --gpu 0 --weights output/faster_rcnn_end2end/train/gaze_faster_rcnn_iter_0.caffemodel --imdb gaze_train --cfg experiments/cfgs/faster_rcnn_end2end.yml --solver models/gaze_vgg16/solver.prototxt --iter 10000
See https://huangying-zhan.github.io/2016/09/22/detection-faster-rcnn.html for reference.