Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training
By Hongkai Zhang, Hong Chang, Bingpeng Ma, Naiyan Wang, Xilin Chen.
This project is based on maskrcnn-benchmark.
[2020.7] Dynamic R-CNN is officially included in MMDetection V2.2. Many thanks to @xvjiarui and @hellock for migrating the code.
Abstract
Although two-stage object detectors have continuously advanced the state-of-the-art performance in recent years, the training process itself is far from crystal clear. In this work, we first point out the inconsistency problem between the fixed network settings and the dynamic training procedure, which greatly affects the performance. For example, the fixed label assignment strategy and regression loss function cannot fit the distribution change of proposals and are harmful to training high quality detectors. Then, we propose Dynamic R-CNN to adjust the label assignment criteria (IoU threshold) and the shape of the regression loss function (parameters of SmoothL1 Loss) automatically based on the statistics of proposals during training. This dynamic design makes better use of the training samples and pushes the detector to fit more high quality samples. Specifically, our method improves upon the ResNet-50-FPN baseline by 1.9% AP and 5.5% AP90 on the MS COCO dataset with no extra overhead. For more details, please refer to our paper.
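To make the two dynamic components more concrete, here is a minimal sketch, not the project's implementation. It assumes the IoU threshold is refreshed once per window of iterations from the mean of a per-iteration statistic (the k-th largest proposal IoU), and that the SmoothL1 beta is refreshed from the median of another per-iteration statistic (the k-th smallest absolute regression error). The class name DynamicStats, the window length, and the default k values are illustrative assumptions.

```python
import statistics


def smooth_l1(x, beta):
    """SmoothL1 loss for a scalar error x: quadratic below beta, linear above."""
    ax = abs(x)
    if ax < beta:
        return 0.5 * ax * ax / beta
    return ax - 0.5 * beta


class DynamicStats:
    """Hypothetical helper that tracks proposal statistics during training."""

    def __init__(self, iou_thresh=0.5, beta=1.0, window=100, k_iou=75, k_beta=10):
        self.iou_thresh = iou_thresh  # current label assignment threshold
        self.beta = beta              # current SmoothL1 parameter
        self.window = window          # iterations between updates (assumed)
        self.k_iou = k_iou            # rank of the IoU statistic (assumed)
        self.k_beta = k_beta          # rank of the error statistic (assumed)
        self._iou_hist = []
        self._err_hist = []

    def record(self, ious, reg_errors):
        """Store one iteration's statistics; update once the window is full.

        Both inputs are assumed to be non-empty lists of floats.
        """
        k = min(self.k_iou, len(ious))
        self._iou_hist.append(sorted(ious, reverse=True)[k - 1])
        k = min(self.k_beta, len(reg_errors))
        self._err_hist.append(sorted(abs(e) for e in reg_errors)[k - 1])
        if len(self._iou_hist) == self.window:
            self.iou_thresh = statistics.mean(self._iou_hist)
            self.beta = statistics.median(self._err_hist)
            self._iou_hist.clear()
            self._err_hist.clear()
```

As proposals improve during training, the recorded IoUs rise and the recorded errors shrink, so under these assumptions the threshold moves up and beta moves down over time.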
Models
Model | Multi-scale training | AP (minival) | AP (test-dev) | Trained model |
---|---|---|---|---|
Dynamic_RCNN_r50_fpn_1x | No | 38.9 | 39.1 | Google Drive |
Dynamic_RCNN_r50_fpn_2x | No | 39.9 | 39.9 | Google Drive |
Dynamic_RCNN_r101_fpn_1x | No | 41.0 | 41.2 | Google Drive |
Dynamic_RCNN_r101_fpn_2x | No | 41.8 | 42.0 | Google Drive |
Dynamic_RCNN_r101_fpn_2x | Yes | 44.4 | 44.7 | Google Drive |
Dynamic_RCNN_r101_dcnv2_fpn_2x | Yes | 46.7 | 46.9 | Google Drive |
- 1x and 2x mean the model is trained for 90K and 180K iterations, respectively.
- For Multi-scale training, the shorter side of images is randomly chosen from (400, 600, 800, 1000, 1200), and the longer side is 1400. We also extend the training time by 1.5x under this setting.
- dcnv2 denotes deformable convolutional networks v2. We follow the same setting as maskrcnn-benchmark. Note that the result of this version is slightly lower than that of mmdetection.
- All results in the table are obtained using a single model with no extra testing tricks. Additionally, adopting multi-scale testing on model Dynamic_RCNN_r101_dcnv2_fpn_2x achieves 49.2% in AP on COCO test-dev. Please set TEST.BBOX_AUG.ENABLED = True in the config.py to enable multi-scale testing. Here we use five scales with shorter sides (800, 1000, 1200, 1400, 1600) and the longer side is 2000 pixels. Note that Dynamic R-CNN* (50.1% AP) in Table 9 is implemented using MMDetection v1.1, please refer to this link.
- If you want to test the model provided by us, please refer to Testing.
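The multi-scale training rule above can be sketched as follows. This is an illustration only: it reads "the longer side is 1400" as an upper bound on the longer side after resizing, which is an assumption about the resizing behavior, and the function names are hypothetical.

```python
import random

TRAIN_SHORT_SIDES = (400, 600, 800, 1000, 1200)  # candidate shorter sides from the notes
TRAIN_MAX_LONG_SIDE = 1400                       # longer-side limit from the notes


def target_size(width, height, short_side, max_long_side):
    """Return (new_width, new_height), keeping the aspect ratio."""
    scale = short_side / min(width, height)
    # Assumed behavior: shrink the scale if the longer side would exceed the limit.
    if max(width, height) * scale > max_long_side:
        scale = max_long_side / max(width, height)
    return round(width * scale), round(height * scale)


def sample_train_size(width, height, rng=random):
    """Pick a random shorter side for one training image and compute its size."""
    short_side = rng.choice(TRAIN_SHORT_SIDES)
    return target_size(width, height, short_side, TRAIN_MAX_LONG_SIDE)
```

For a 640x480 image and a sampled shorter side of 800, this gives 1067x800; for a very wide image the longer-side limit takes over instead.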
Getting started
Installation
0. Requirements
- pytorch (v1.0.1.post2, other versions have not been tested)
- torchvision (v0.2.2.post3, other versions have not been tested)
- cocoapi
- matplotlib
- tqdm
- cython
- easydict
- opencv
Anaconda3 is recommended here since it integrates many useful packages. Please make sure that your conda environment is set up properly. Then install pytorch and torchvision manually as follows:
pip install torch==1.0.1.post2
pip install torchvision==0.2.2.post3
Other dependencies will be installed during setup.
1. Clone this repo
git clone https://github.com/hkzhang95/DynamicRCNN.git
2. Compile kernels
Please make sure that CUDA is installed and added to the PATH. Only CUDA-9.0 has been tested in our experiments.
cd ${DynamicRCNN_ROOT}
python setup.py build develop
3. Prepare data and output directory
cd ${DynamicRCNN_ROOT}
mkdir data
mkdir output
Prepare data and pretrained models:
Then organize them as follows:
DynamicRCNN
├── dynamic_rcnn
├── models
├── output
├── data
│ ├── basemodels/R-50.pkl
│ ├── coco
│ │ ├── annotations
│ │ ├── train2017(2014)
│ │ ├── val2017(2014)
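Before training, it can help to confirm that the layout above is in place. The following is a small sketch of such a check; the helper name missing_paths is hypothetical, and the 2017 split is assumed (swap in train2014/val2014 if that is what you use).

```python
from pathlib import Path

# Paths taken from the tree above; the 2017 split is an assumption.
EXPECTED = (
    "data/basemodels/R-50.pkl",
    "data/coco/annotations",
    "data/coco/train2017",
    "data/coco/val2017",
    "output",
)


def missing_paths(root):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]
```

Calling missing_paths(".") from the repository root returns an empty list when everything is in place.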
Training
We use torch.distributed.launch to launch multi-GPU training.
cd models/zhanghongkai/dynamic_rcnn/coco/dynamic_rcnn_r50_fpn_1x
python -m torch.distributed.launch --nproc_per_node=8 train.py
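For readers unfamiliar with the launcher: torch.distributed.launch starts one process per GPU, passes each process a --local_rank argument, and sets environment variables such as WORLD_SIZE. The sketch below shows how a script might read that context; it is not this repository's train.py, and the function name parse_launch_context is hypothetical.

```python
import argparse
import os


def parse_launch_context(argv=None, environ=None):
    """Read what torch.distributed.launch hands to each worker process."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # The launcher appends --local_rank=<gpu index> to each worker's arguments.
    parser.add_argument("--local_rank", type=int, default=0)
    args, _ = parser.parse_known_args(argv)
    # WORLD_SIZE is set by the launcher; default to 1 for a plain single-process run.
    world_size = int(environ.get("WORLD_SIZE", 1))
    return {
        "local_rank": args.local_rank,
        "world_size": world_size,
        "distributed": world_size > 1,
    }
```

A real training script would typically use local_rank to select the CUDA device before initializing the process group.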
Outputs
Training and testing logs will be saved automatically in the output directory, following the same path as in models.
For example, the experiment directory and log directory are formed as follows:
models/zhanghongkai/dynamic_rcnn/coco/dynamic_rcnn_r50_fpn_1x
output/zhanghongkai/dynamic_rcnn/coco/dynamic_rcnn_r50_fpn_1x
You can link the log to your experiment directory by running this script in the experiment directory:
python config.py -log
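The mapping between experiment and log directories described above amounts to swapping the leading models component for output. A minimal sketch, assuming relative POSIX-style paths; the function name log_dir_for is hypothetical and is not part of config.py.

```python
from pathlib import PurePosixPath


def log_dir_for(experiment_dir):
    """Swap the leading 'models' component for 'output'."""
    parts = PurePosixPath(experiment_dir).parts
    if not parts or parts[0] != "models":
        raise ValueError("expected a path that starts with 'models/'")
    return str(PurePosixPath("output", *parts[1:]))
```

For example, log_dir_for("models/zhanghongkai/dynamic_rcnn/coco/dynamic_rcnn_r50_fpn_1x") yields the output path shown in the example above.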
Testing
Use -i to specify the iteration for testing; the default is the latest model.
# for regular testing and evaluation
python -m torch.distributed.launch --nproc_per_node=8 test.py
# for specified iteration
python -m torch.distributed.launch --nproc_per_node=8 test.py -i $iteration_number
If you want to test our provided model, download it, move it to the corresponding log directory, and create a symbolic link as follows:
# example for Dynamic_RCNN_r50_fpn_1x
cd models/zhanghongkai/dynamic_rcnn/coco/dynamic_rcnn_r50_fpn_1x
python config.py -log
realpath log | xargs mkdir
mkdir -p log/checkpoints
mv path/to/the/model log/checkpoints
realpath log/checkpoints/dynamic_rcnn_r50_fpn_1x_test_model_0090000.pth last_checkpoint | xargs ln -s
Then you can follow the regular testing and evaluation process.
Third-party resources
- MXNet implementation: SimpleDet
Acknowledgement
Citations
Please consider citing our paper in your publications if it helps your research:
@article{DynamicRCNN,
author = {Hongkai Zhang and Hong Chang and Bingpeng Ma and Naiyan Wang and Xilin Chen},
title = {Dynamic {R-CNN}: Towards High Quality Object Detection via Dynamic Training},
journal = {arXiv preprint arXiv:2004.06002},
year = {2020}
}