Forked from Relation Networks for Object Detection with major contributors Dazhi Cheng, Jiayuan Gu, Han Hu and Zheng Zhang.
Joined with Flow-Guided Feature Aggregation (FGFA) with major contributors Yuqing Zhu, Shuhao Fu, and Xizhou Zhu, while they were interns at MSRA.
And with Deformable ConvNets with major contributors Yuwen Xiong, Haozhi Qi, Guodong Zhang, Yi Li, Jifeng Dai, Bin Xiao, Han Hu and Yichen Wei.
Relation Networks for Object Detection is described in a CVPR 2018 oral paper.
Flow-Guided Feature Aggregation (FGFA) is described in an ICCV 2017 paper.
Deformable ConvNets is described in an ICCV 2017 oral paper.
From the original Relation Networks README
This is an official implementation for Relation Networks for Object Detection based on MXNet. It is worth noting that:
- This repository is tested on official MXNet v1.1.0@(commit 629bb6). You should be able to use it with any version of MXNet that contains the required operators, such as Deformable Convolution.
- We trained our model based on the ImageNet pre-trained ResNet-v1-101 using a model converter. The converted model produces slightly lower accuracy (Top-1 Error on ImageNet val: 24.0% vs. 23.6%).
- This repository is based on Deformable ConvNets.
Our modified code is tested on Ubuntu 16.04 with CUDA 9.1 and MXNet 1.2.1.
© Microsoft, 2018. Licensed under an MIT license.
If you find Relation Networks for Object Detection useful in your research, please consider citing:
@article{hu2017relation,
title={Relation Networks for Object Detection},
author={Hu, Han and Gu, Jiayuan and Zhang, Zheng and Dai, Jifeng and Wei, Yichen},
journal={arXiv preprint arXiv:1711.11575},
year={2017}
}
If you find Flow-Guided Feature Aggregation useful in your research, please consider citing:
@inproceedings{zhu17fgfa,
Author = {Xizhou Zhu, Yujie Wang, Jifeng Dai, Lu Yuan, Yichen Wei},
Title = {Flow-Guided Feature Aggregation for Video Object Detection},
Conference = {ICCV},
Year = {2017}
}
@inproceedings{dai16rfcn,
Author = {Jifeng Dai, Yi Li, Kaiming He, Jian Sun},
Title = {{R-FCN}: Object Detection via Region-based Fully Convolutional Networks},
Conference = {NIPS},
Year = {2016}
}
If you find Deformable ConvNets useful in your research, please consider citing:
@article{dai17dcn,
Author = {Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei},
Title = {Deformable Convolutional Networks},
Journal = {arXiv preprint arXiv:1703.06211},
Year = {2017}
}
@inproceedings{dai16rfcn,
Author = {Jifeng Dai, Yi Li, Kaiming He, Jian Sun},
Title = {{R-FCN}: Object Detection via Region-based Fully Convolutional Networks},
Conference = {NIPS},
Year = {2016}
}
Model | training data | testing data | mAP | mAP@0.5 | mAP@0.75 | mAP@S | mAP@M | mAP@L | Inference Time | Post Processing Time |
---|---|---|---|---|---|---|---|---|---|---|
2FC + nms(0.5) ResNet-101 | coco trainval35k | coco minival | 31.8 | 53.9 | 32.2 | 10.5 | 35.2 | 51.5 | 0.168s | 0.025s |
2FC + softnms(0.6) ResNet-101 | coco trainval35k | coco minival | 32.3 | 52.8 | 34.1 | 11.1 | 35.9 | 51.8 | 0.200s | 0.060s |
2FC + Relation Module + softnms ResNet-101 | coco trainval35k | coco minival | 34.7 | 55.3 | 37.2 | 13.7 | 38.8 | 53.6 | 0.211s | 0.059s |
2FC + Learn NMS ResNet-101 | coco trainval35k | coco minival | 32.6 | 51.8 | 35.0 | 11.8 | 36.6 | 52.1 | 0.162s | 0.020s |
2FC + Relation Module + Learn NMS(e2e) ResNet-101 | coco trainval35k | coco minival | 35.2 | 55.5 | 38.0 | 15.2 | 39.2 | 54.1 | 0.175s | 0.022s |
Model | training data | testing data | mAP | mAP@0.5 | mAP@0.75 | mAP@S | mAP@M | mAP@L | Inference Time | NMS Time |
---|---|---|---|---|---|---|---|---|---|---|
2FC + nms(0.5) ResNet-101 | coco trainval35k | coco minival | 37.2 | 58.1 | 40.0 | 16.4 | 41.3 | 55.5 | 0.180s | 0.022s |
2FC + softnms(0.6) ResNet-101 | coco trainval35k | coco minival | 37.5 | 57.3 | 41.0 | 16.6 | 41.7 | 55.8 | 0.208s | 0.052s |
2FC + Relation Module + Learn NMS(e2e) ResNet-101 | coco trainval35k | coco minival | 38.4 | 57.6 | 41.6 | 18.2 | 43.1 | 56.6 | 0.188s | 0.023s |
Model | training data | testing data | mAP | mAP@0.5 | mAP@0.75 | mAP@S | mAP@M | mAP@L | Inference Time | NMS Time |
---|---|---|---|---|---|---|---|---|---|---|
2FC + nms(0.5) ResNet-101 | coco trainval35k | coco minival | 36.6 | 59.3 | 39.3 | 20.3 | 40.5 | 49.4 | 0.196s | 0.037s |
2FC + softnms(0.6) ResNet-101 | coco trainval35k | coco minival | 36.8 | 57.8 | 40.7 | 20.4 | 40.8 | 49.7 | 0.323s | 0.167s |
2FC + Relation Module + Learn NMS(e2e) ResNet-101 | coco trainval35k | coco minival | 38.6 | 59.9 | 43.0 | 22.1 | 42.3 | 52.8 | 0.232s | 0.022s |
Running time is counted on a single Maxwell Titan X GPU (mini-batch size is 1 in inference).
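To read the three tables at a glance, the gain of each "2FC + Relation Module + Learn NMS(e2e)" row over its "2FC + nms(0.5)" baseline can be computed directly from the reported mAP values. The snippet below is plain arithmetic on the numbers above; the keys `table 1` to `table 3` are labels chosen here for illustration and simply follow the order in which the tables appear.

```python
from __future__ import print_function

# (baseline mAP, Relation Module + Learn NMS(e2e) mAP), copied from the tables above.
# The "table N" labels are illustrative only and follow the order of appearance.
reported_map = {
    "table 1": (31.8, 35.2),
    "table 2": (37.2, 38.4),
    "table 3": (36.6, 38.6),
}


def map_gain(baseline, improved):
    """Absolute mAP gain in points, rounded to one decimal like the tables."""
    return round(improved - baseline, 1)


gains = {name: map_gain(*pair) for name, pair in reported_map.items()}
for name in sorted(gains):
    print("%s: +%.1f mAP" % (name, gains[name]))
```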
- MXNet from the official repository. We tested our code on MXNet 1.2.1. Due to the rapid development of MXNet, it is recommended to check out this version if you encounter any issues.
- Python 2.7. We recommend using Anaconda2 as it already includes many common packages. Python 3 is not supported yet; to use it, you will need to modify the code yourself.
- The following Python packages:
Cython
EasyDict
mxnet-cu91 # changed from mxnet-cu80 used in relation networks code
opencv-python
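A quick way to confirm that these packages are usable is to try importing each one. The sketch below assumes the usual import names for the pip packages listed above (`Cython`, `easydict`, `cv2`, `mxnet`); the helper name `missing_packages` is hypothetical. It only reports what is missing and installs nothing. `cv2` is deliberately probed before `mxnet`, in line with the import-order advice in the FAQ of this README.

```python
from __future__ import print_function
import importlib

# (pip package name, module name used in `import`); module names are assumed
# to be the standard ones. cv2 is listed before mxnet on purpose.
REQUIRED = [
    ("Cython", "Cython"),
    ("EasyDict", "easydict"),
    ("opencv-python", "cv2"),
    ("mxnet-cu91", "mxnet"),
]


def missing_packages(required):
    """Return the pip names whose modules cannot be imported, in list order."""
    missing = []
    for pip_name, module_name in required:
        try:
            importlib.import_module(module_name)
        except (ImportError, OSError):
            # OSError covers a package that is installed but fails to load
            # a shared library (for example a missing libcudart.so).
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_packages(REQUIRED) or "none")
```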
Any NVIDIA GPU with at least 6GB of memory should be OK.
- Clone the repository.
git clone https://github.com/HaydenFaulkner/Relation-Networks-for-Object-Detection-Video.git
cd Relation-Networks-for-Object-Detection-Video
- Run `sh ./init.sh`. The script builds the Cython module automatically and creates some folders.
- Install MXNet:
Quick start
3.1 Install MXNet and all dependencies by
pip install -r requirements.txt
If there is no other error message, MXNet should be installed successfully.
If you get an error about not finding `libcudart.so` even after setting your environment variables, try running (with the correct paths):
sudo sh -c "echo '/usr/local/cuda/lib64\n/usr/local/cuda/lib' >> /etc/ld.so.conf.d/nvidia.conf"
sudo ldconfig
Build from source (alternative way)
3.2 Clone MXNet v1.1.0 by
git clone -b v1.1.0 --recursive https://github.com/apache/incubator-mxnet.git
3.3 Compile MXNet
cd ${MXNET_ROOT}
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
3.4 Install the MXNet Python binding by
Note: If you will actively switch between different versions of MXNet, please follow 3.5 instead of 3.4
cd python
sudo python setup.py install
3.5 For advanced users, you may put your Python package into `./external/mxnet/$(YOUR_MXNET_PACKAGE)/mxnet` and set `MXNET_VERSION` in `./experiments/relation_rcnn/cfgs/*.yaml` to `$(YOUR_MXNET_PACKAGE)`. This lets you switch between different versions of MXNet quickly.
- Make sure the correct CUDA installation is on your `LD_LIBRARY_PATH`.
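One way to check this is to list which `LD_LIBRARY_PATH` entries actually contain a CUDA runtime library. The following is a minimal sketch; the helper name `dirs_with_cudart` is hypothetical, and it only looks for files matching `libcudart.so*` without verifying that the version matches your MXNet build.

```python
from __future__ import print_function
import glob
import os


def dirs_with_cudart(ld_library_path):
    """Return the entries of an LD_LIBRARY_PATH-style string that contain libcudart.so*."""
    hits = []
    for entry in ld_library_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing or doubled separators.
        if entry and glob.glob(os.path.join(entry, "libcudart.so*")):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = dirs_with_cudart(os.environ.get("LD_LIBRARY_PATH", ""))
    print("libcudart found in:", found or "no LD_LIBRARY_PATH entry")
```

If the list is empty, or the first entry points at a different CUDA version than the one you installed, adjust `LD_LIBRARY_PATH` before importing MXNet.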
- Please download the datasets, and use the following structure:
1.1 MSCOCO 2017 (18 + 1 + 6 + .241 GB)
./data/coco/
1.2 ImageNetDET 2015 (47 + .015 + .0014 GB) (unchanged from 2014 data), ImageNetLOC 2015 (160 GB) (from Kaggle) and ImageNetVID 2015 (86 GB)
./data/ILSVRC2015/
./data/ILSVRC2015/Annotations/DET
./data/ILSVRC2015/Annotations/LOC
./data/ILSVRC2015/Annotations/VID
./data/ILSVRC2015/Data/DET
./data/ILSVRC2015/Data/LOC
./data/ILSVRC2015/Data/VID
./data/ILSVRC2015/ImageSets
1.3 PascalVOC 2007 (.439 GB) and PascalVOC 2012 (2 GB)
./data/VOCdevkit/VOC2007/
./data/VOCdevkit/VOC2012/
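Because the datasets are large, it can help to confirm the directory layout before starting a run. The sketch below checks only that the directories listed above exist under the repository root; it does not validate their contents. The helper name `missing_dirs` is hypothetical and not part of this repository.

```python
from __future__ import print_function
import os

# Directories from the layout above, relative to the repository root.
EXPECTED_DIRS = [
    "data/coco",
    "data/ILSVRC2015/Annotations/DET",
    "data/ILSVRC2015/Annotations/LOC",
    "data/ILSVRC2015/Annotations/VID",
    "data/ILSVRC2015/Data/DET",
    "data/ILSVRC2015/Data/LOC",
    "data/ILSVRC2015/Data/VID",
    "data/ILSVRC2015/ImageSets",
    "data/VOCdevkit/VOC2007",
    "data/VOCdevkit/VOC2012",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under `root`."""
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]


if __name__ == "__main__":
    for path in missing_dirs("."):
        print("missing:", path)
```

You only need the datasets for the experiments you plan to run, so a missing entry is not necessarily an error.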
- Please download the ImageNet-pretrained ResNet-v1-101 backbone model and the Faster RCNN ResNet-v1-101 model manually from Relation Backbone OneDrive, and put them under the folder `./model/relation/pretrained_model`. Make sure it looks like this:
./models/backbones/resnet_v1_101-0000.params
We use a pretrained Faster RCNN and fix its params when training Faster RCNN with the Learn NMS head. If you want to run such experiments, please also download the pretrained Faster RCNN model from OneDrive, making sure it looks like this:
./models/relation/pretrained/coco_resnet_v1_101_rcnn-0008.params
- For FPN related experiments, we use proposals generated by a pretrained RPN to speed up our experiments. Please download the proposals from Proposals OneDrive and put them under the folder `./proposal/resnet_v1_101_fpn/rpn_data`. Make sure it looks like this:
./proposal/resnet_v1_101_fpn/rpn_data/COCO_minival2014_rpn.pkl
./proposal/resnet_v1_101_fpn/rpn_data/COCO_train2014_rpn.pkl
./proposal/resnet_v1_101_fpn/rpn_data/COCO_valminusminival2014_rpn.pkl
- Download the FGFA Flying-Chairs pre-trained backbone FlowNet model from FGFA Backbone OneDrive, and make sure it looks like this:
./models/backbones/flownet-0000.params
You can delete the `resnet_v1_101-0000.params` downloaded here, as it duplicates the one downloaded in step 2.
Provided are trained models for each of the problems.
- To try out our pre-trained relation network models, please download manually from Relation PreTrained OneDrive, and make sure it looks like this:
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_end2end_8epoch/train2014_valminusminival2014/rcnn_coco-0008.params
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_end2end_relation_8epoch/train2014_valminusminival2014/rcnn_coco-0008.params
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_end2end_learn_nms_3epoch/train2014_valminusminival2014/rcnn_coco-0003.params
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_end2end_relation_learn_nms_8epoch/train2014_valminusminival2014/rcnn_coco-0008.params
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_dcn_end2end_8epoch/train2014_valminusminival2014/rcnn_coco-0008.params
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_dcn_end2end_relation_learn_nms_8epoch/train2014_valminusminival2014/rcnn_coco-0008.params
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_fpn_8epoch/train2014_valminusminival2014/rcnn_fpn_coco-0008.params
./models/relation/pretrained/rcnn/coco/resnet_v1_101_coco_trainvalminus_rcnn_fpn_relation_learn_nms_8epoch/train2014_valminusminival2014/rcnn_fpn_coco-0008.params
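The file names above follow MXNet's checkpoint naming convention, `<prefix>-<epoch zero-padded to four digits>.params`, which is the `prefix`/`epoch` pair used by MXNet helpers such as `mx.model.load_checkpoint`. As a rough sketch, a path can be split into those two parts as follows. The helper name `split_checkpoint` is hypothetical and not part of this repository, and it only parses the name; it does not load anything.

```python
from __future__ import print_function
import re

# "<prefix>-<4-digit epoch>.params", e.g. "rcnn_coco-0008.params".
_PARAMS_RE = re.compile(r"^(?P<prefix>.+)-(?P<epoch>\d{4})\.params$")


def split_checkpoint(path):
    """Split an MXNet-style .params path into (prefix, epoch)."""
    match = _PARAMS_RE.match(path)
    if match is None:
        raise ValueError("not an MXNet checkpoint file name: %r" % path)
    return match.group("prefix"), int(match.group("epoch"))


prefix, epoch = split_checkpoint("./models/backbones/resnet_v1_101-0000.params")
print(prefix, epoch)
```

This makes it easy to see that, for example, the Learn NMS model above is a 3-epoch checkpoint while the others are 8-epoch checkpoints.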
- To run the Faster RCNN with Relation Module and Learn NMS model, run
python experiments/relation_rcnn/rcnn_test.py --cfg experiments/relation_rcnn/cfgs/resnet_v1_101_coco_trainvalminus_rcnn_end2end_relation_learn_nms_8epoch.yaml --ignore_cache
If you want to try other models, just change the config file. There are ten config files in the `./experiments/relation_rcnn/cfgs` folder, eight of which are provided with pretrained models.
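To try several models in turn, the test command shown above can be generated for each config file. The sketch below only prints the commands; it assumes it is run from the repository root and that `rcnn_test.py` accepts every config in the folder, which is not guaranteed. Configs without a downloaded pretrained model will fail at load time. The helper name `build_test_command` is hypothetical.

```python
from __future__ import print_function
import glob
import os

TEST_SCRIPT = "experiments/relation_rcnn/rcnn_test.py"
CFG_DIR = "experiments/relation_rcnn/cfgs"


def build_test_command(cfg_path, ignore_cache=True):
    """Build the argument list for testing one config, mirroring the command above."""
    cmd = ["python", TEST_SCRIPT, "--cfg", cfg_path]
    if ignore_cache:
        cmd.append("--ignore_cache")
    return cmd


if __name__ == "__main__":
    # Print one command per config; pass a list to subprocess.call to run it.
    for cfg in sorted(glob.glob(os.path.join(CFG_DIR, "*.yaml"))):
        print(" ".join(build_test_command(cfg)))
```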
- Download the trained FGFA model (on ImageNet DET + VID train) from FGFA PreTrained OneDrive, and make sure it looks like this:
./models/fgfa/pretrained/rfcn_fgfa_flownet_vid-0000.params
TODO: put this into output directory
- Run
python ./fgfa_rfcn/demo.py
- To use the demo with the pre-trained deformable models, please download them manually from Deformable PreTrained OneDrive or BaiduYun, and put them under the folder `model/`. Make sure it looks like this:
./models/deformable/pretrained/rfcn_dcn_coco-0000.params
./models/deformable/pretrained/rfcn_coco-0000.params
./models/deformable/pretrained/fpn_dcn_coco-0000.params
./models/deformable/pretrained/fpn_coco-0000.params
./models/deformable/pretrained/rcnn_dcn_coco-0000.params
./models/deformable/pretrained/rcnn_coco-0000.params
./models/deformable/pretrained/deeplab_dcn_cityscapes-0000.params
./models/deformable/pretrained/deeplab_cityscapes-0000.params
./models/deformable/pretrained/deform_conv-0000.params
./models/deformable/pretrained/deform_psroi-0000.params
- To run the R-FCN demo, run
python ./rfcn/demo.py --rfcn_only
- To visualize the offset of deformable convolution and deformable psroipooling, run
python ./rfcn/deform_conv_demo.py
python ./rfcn/deform_psroi_demo.py
- All of the experiment settings (GPU #, dataset, etc.) are kept in yaml config files in the folder `./experiments/../cfgs`.
- To perform experiments, run the Python scripts with the corresponding config file as input. For example:
2.1 To train and test Faster RCNN with Relation Module and Learn NMS(e2e), use the following command:
python experiments/relation_rcnn/rcnn_end2end_train_test.py --cfg experiments/relation_rcnn/cfgs/resnet_v1_101_coco_trainvalminus_rcnn_end2end_relation_learn_nms_8epoch.yaml
A cache folder will be created automatically to save the model and the log under `models/relation/output/rcnn/`.
The rcnn_end2end_train_test.py script is for Faster RCNN and Deformable Faster RCNN experiments that train the RPN together with RCNN. To train and test FPN, which uses previously generated proposals, use the following command:
python experiments/relation_rcnn/rcnn_train_test.py --cfg experiments/relation_rcnn/cfgs/resnet_v1_101_coco_trainvalminus_fpn_relation_learn_nms_8epoch.yaml
2.2 To train and test FGFA with R-FCN, use the following command
python experiments/fgfa_rfcn/fgfa_rfcn_end2end_train_test.py --cfg experiments/fgfa_rfcn/cfgs/resnet_v1_101_flownet_imagenet_vid_rfcn_end2end_ohem.yaml
A cache folder will be created automatically to save the model and the log under `models/fgfa/output/fgfa_rfcn/imagenet_vid/`.
2.3 To perform experiments with just deformable nets, run the Python scripts with the corresponding config file as input. For example, to train and test deformable convnets on COCO with ResNet-v1-101, use the following command
python experiments/rfcn/rfcn_end2end_train_test.py --cfg experiments/rfcn/cfgs/resnet_v1_101_coco_trainval_rfcn_dcn_end2end_ohem.yaml
A cache folder will be created automatically to save the model and the log under `models/deformable/output/rfcn_dcn_coco/`.
- Please find more details in config files and in the code.
Q: I encounter a `segmentation fault` at the beginning.
A: A compatibility issue has been identified between MXNet and opencv-python 3.0+. We suggest that you always `import cv2` before `import mxnet` in the entry script.
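If you maintain several entry scripts, the import order can be checked statically rather than by waiting for a crash. The following is a rough sketch using Python's `ast` module; the helper names are hypothetical. It inspects a single source file and compares line numbers only, so it cannot see imports that happen indirectly through other modules.

```python
import ast


def first_import_lines(source):
    """Map each imported top-level module name to the line of its first import."""
    first = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            first[top] = min(first.get(top, node.lineno), node.lineno)
    return first


def cv2_before_mxnet(source):
    """True if the script does not import mxnet, or imports cv2 on an earlier line."""
    lines = first_import_lines(source)
    if "mxnet" not in lines:
        return True
    return "cv2" in lines and lines["cv2"] < lines["mxnet"]
```

For example, `cv2_before_mxnet(open("experiments/relation_rcnn/rcnn_test.py").read())` would check one of the entry scripts used earlier in this README.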