Using Multi-label Classification to Improve Object Detection
MONet is based on py-R-FCN-priv; thanks to soeaver for that work.
The official R-FCN code (written in MATLAB) is available here.
The official R-FCN code (written in Python) is available here.
- Clone the MONet repository
git clone https://github.com/GT9505/MONet
We'll call the directory that you cloned MONet into MONET_ROOT.
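The later commands refer to this directory as $MONET_ROOT, so it helps to export it as a shell variable first. A minimal sketch, assuming the clone was made in the current working directory (adjust the path if you cloned elsewhere):

```shell
# Assumption: `git clone` was run from the current directory,
# so the repository lives in ./MONet.
export MONET_ROOT="$(pwd)/MONet"
echo "MONET_ROOT is set to $MONET_ROOT"
```

The variable only lasts for the current shell session; set it again in each new terminal before running the commands that follow.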
- Build the Cython modules
cd $MONET_ROOT/lib
make
- Build Caffe and pycaffe
cd $MONET_ROOT/caffe
# Now follow the Caffe installation instructions here:
#   http://caffe.berkeleyvision.org/installation.html
# cp Makefile.config.example Makefile.config
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make all -j && make pycaffe -j
Note: Caffe must be built with support for Python layers!
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1
# NCCL (https://github.com/NVIDIA/nccl) is necessary for multi-GPU training with python layer
USE_NCCL := 1
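Because a flag that is still commented out is easy to overlook, the settings above can be checked before building. The following is a minimal sketch; the helper name check_caffe_flags is hypothetical and is not part of MONet or Caffe. It takes a config file and a list of flag names, and succeeds only if each flag is set to 1 on an uncommented line.

```shell
# check_caffe_flags FILE FLAG...
# Hypothetical helper (not part of MONet or Caffe).
# Returns 0 only if every named flag appears as "FLAG := 1"
# on a line that is not commented out.
check_caffe_flags() {
    cfg_file="$1"
    shift
    cfg_status=0
    for flag in "$@"; do
        # Lines starting with '#' do not match because the flag name
        # must be the first non-blank text on the line.
        if grep -Eq "^[[:space:]]*${flag}[[:space:]]*:=[[:space:]]*1([[:space:]]|\$)" "$cfg_file"; then
            echo "ok:      $flag"
        else
            echo "missing: $flag"
            cfg_status=1
        fi
    done
    return "$cfg_status"
}
```

For example, check_caffe_flags "$MONET_ROOT/caffe/Makefile.config" WITH_PYTHON_LAYER USE_NCCL checks the two flags needed for multi-GPU training; add USE_CUDNN to the list if you also want to confirm the recommended cuDNN setting.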
Please follow the official py-R-FCN instructions to prepare the training and testing sets.
Please download the ResNet-101 backbone network here.
Usage is the same as for py-R-FCN-priv:
cd $MONET_ROOT
./experiments/scripts/monet_end2end_ohem_multi_gpu.sh 0 pascal_voc
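Training fails early if pycaffe was not built, so a quick pre-flight check can save time. The sketch below assumes the bundled Caffe keeps the stock Caffe Makefile layout, in which make pycaffe produces python/caffe/_caffe.so; the helper name pycaffe_is_built is hypothetical.

```shell
# pycaffe_is_built ROOT
# Hypothetical helper (not part of MONet or Caffe).
# Assumption: the stock Caffe Makefile layout, where `make pycaffe`
# writes the Python extension to caffe/python/caffe/_caffe.so under ROOT.
pycaffe_is_built() {
    [ -f "$1/caffe/python/caffe/_caffe.so" ]
}
```

Call it as pycaffe_is_built "$MONET_ROOT" || echo "pycaffe is missing; rebuild Caffe first" before launching the training script.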
With the default hyperparameters and iterations, you can achieve an mAP of around 83.0% without multi-scale training. The model with 83.0% mAP is available. Multi-scale training further improves the mAP to 83.6%.
MONet is released under the MIT License (refer to the LICENSE file for details).
If you find MONet useful in your research, please consider citing:
@article{gong18monet,
Author = {Tao Gong and Bin Liu and Qi Chu and Nenghai Yu},
Title = {Using Multi-label Classification to Improve Object Detection},
Journal = {Submitted to Neurocomputing},
Year = {2018},
Month = {9}
}