This branch contains fixes to the Detectron code that allow it to be applied to domains with many small objects. It was designed specifically for traffic-sign detection, as described in the "Deep Learning for Large-Scale Traffic-Sign Detection and Recognition" ITS 2019 journal paper.
The following changes are included:
- integrated Online Hard Example Mining (OHEM)
- modified selection of training ROIs to cover small and large regions evenly
- added weights to loss of background samples during training (weight of 0.01 for RPN and 0.1 for classification)
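
As a rough illustration of the OHEM idea listed above, the sketch below keeps only the ROIs with the highest loss for the backward pass. It is not the code from this branch: the function name `select_hard_examples` and the example loss values are hypothetical, and details such as de-duplicating overlapping ROIs are left out.

```python
def select_hard_examples(losses, num_keep):
    """Return indices of the num_keep ROIs with the highest loss, hardest first."""
    # Sort ROI indices by their loss, largest loss first.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:num_keep]


# Hypothetical per-ROI losses from one forward pass.
roi_losses = [0.02, 1.30, 0.40, 0.01, 0.95]
hard = select_hard_examples(roi_losses, num_keep=2)
print(hard)  # [1, 4]
```

Only the selected indices would contribute to the gradient, so easy background ROIs with near-zero loss stop dominating the update.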
YAML definition files of Detectron models for the DFG dataset are available below:
Each zip contains models based on ResNet101_FPN and ResNet50_FPN that have enabled OHEM (`OHEM: True`), even selection of small and large ROIs (`RPN_EVENLY_SELECT_POS_ROIS: True`) and weighting of pos/neg classes (`RPN_SIZE_WEIGHTED_LOSS: True` and `CLS_SIZE_WEIGHTED_LOSS: True`).
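
One possible reading of "even selection of small and large ROIs" is sketched below: positive ROIs are grouped into size bins by area, and samples are drawn from the bins in turn so that no size range dominates. This is an assumption for illustration only; the function `select_rois_evenly`, the `bin_edges` parameter, and the 32x32 threshold are hypothetical and may differ from what `RPN_EVENLY_SELECT_POS_ROIS` actually does.

```python
import random


def select_rois_evenly(areas, num_select, bin_edges, seed=0):
    """Pick up to num_select ROI indices, drawing from size bins in turn."""
    rng = random.Random(seed)
    # Group ROI indices by the size bin their area falls into.
    bins = [[] for _ in range(len(bin_edges) + 1)]
    for idx, area in enumerate(areas):
        bin_index = sum(area >= edge for edge in bin_edges)
        bins[bin_index].append(idx)
    for members in bins:
        rng.shuffle(members)
    # Round-robin over the bins until the quota is met or all bins are empty.
    selected = []
    while len(selected) < num_select and any(bins):
        for members in bins:
            if members and len(selected) < num_select:
                selected.append(members.pop())
    return selected


# Hypothetical ROI areas in pixels: six small regions and two large ones.
areas = [100, 120, 90, 110, 5000, 130, 80, 6000]
picked = select_rois_evenly(areas, num_select=4, bin_edges=[32 * 32])
print(sorted(picked))
```

With plain random sampling the six small ROIs would usually crowd out the two large ones; here the four picks always contain two from each bin.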
You can download weights trained on the DFG dataset for the models above from:
Note: Due to a slight update of the DFG dataset, the results may vary from those in the ITS 2019 paper.
Our changes to the code require a custom `WeightedSigmoidCrossEntropyLoss` operation (for the `RPN_SIZE_WEIGHTED_LOSS` and `CLS_SIZE_WEIGHTED_LOSS` options), which is implemented in the `DETECTRON_PATH/caffe2-modules/` folder. All files in `DETECTRON_PATH/caffe2-modules/*` need to be copied into your Caffe2 source (`CAFFE2_SRC_PATH/modules/detectron`).
export DETECTRON_PATH=/path/to/detectron
export CAFFE2_SRC_PATH=/path/to/caffe2_source
cp $DETECTRON_PATH/caffe2-modules/* $CAFFE2_SRC_PATH/modules/detectron
After copying `caffe2-modules/*` into the Caffe2 source, proceed with the installation instructions for Caffe2 and Detectron in `INSTALL.md`.
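
For intuition about what a weighted sigmoid cross-entropy computes, here is a minimal pure-Python sketch. It is not the Caffe2 operator shipped in `caffe2-modules/`: the function names are hypothetical, and averaging over all samples is an assumption, since the operator's normalization is not described here. The background weight of 0.01 is the RPN value quoted in the list of changes.

```python
import math


def weighted_sigmoid_cross_entropy(logits, targets, weights):
    """Mean of per-sample sigmoid cross-entropy, each term scaled by its weight."""
    total = 0.0
    for x, t, w in zip(logits, targets, weights):
        # Numerically stable form of -[t*log(s(x)) + (1-t)*log(1-s(x))].
        loss = max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
        total += w * loss
    return total / len(logits)


def background_weights(targets, bg_weight):
    """Weight 1.0 for foreground (target 1), bg_weight for background (target 0)."""
    return [1.0 if t == 1 else bg_weight for t in targets]


# One foreground anchor and three background anchors, all with logit 0.
targets = [1, 0, 0, 0]
logits = [0.0, 0.0, 0.0, 0.0]
weights = background_weights(targets, bg_weight=0.01)
loss = weighted_sigmoid_cross_entropy(logits, targets, weights)
print(weights, loss)
```

Down-weighting the background terms keeps the many negative samples from overwhelming the few positives, which matters when the objects are small and sparse.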
Please cite our ITS 2019 paper when using the Detectron modifications from this repository or the DFG dataset:
@article{Tabernik2019ITS,
author = {Tabernik, Domen and Sko{\v{c}}aj, Danijel},
journal = {IEEE Transactions on Intelligent Transportation Systems},
title = {{Deep Learning for Large-Scale Traffic-Sign Detection and Recognition}},
year = {2019},
doi={10.1109/TITS.2019.2913588},
ISSN={1524-9050}
}
Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including Mask R-CNN. It is written in Python and powered by the Caffe2 deep learning framework.
At FAIR, Detectron has enabled numerous research projects, including: Feature Pyramid Networks for Object Detection, Mask R-CNN, Detecting and Recognizing Human-Object Interactions, Focal Loss for Dense Object Detection, Non-local Neural Networks, Learning to Segment Every Thing, and Data Distillation: Towards Omni-Supervised Learning.
The goal of Detectron is to provide a high-quality, high-performance codebase for object detection research. It is designed to be flexible in order to support rapid implementation and evaluation of novel research. Detectron includes implementations of the following object detection algorithms:
- Mask R-CNN -- Marr Prize at ICCV 2017
- RetinaNet -- Best Student Paper Award at ICCV 2017
- Faster R-CNN
- RPN
- Fast R-CNN
- R-FCN
using the following backbone network architectures:
- ResNeXt{50,101,152}
- ResNet{50,101,152}
- Feature Pyramid Networks (with ResNet/ResNeXt)
- VGG16
Additional backbone architectures may be easily implemented. For more details about these models, please see References below.
Detectron is released under the Apache 2.0 license. See the NOTICE file for additional details.
If you use Detectron in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.
@misc{Detectron2018,
author = {Ross Girshick and Ilija Radosavovic and Georgia Gkioxari and
Piotr Doll\'{a}r and Kaiming He},
title = {Detectron},
howpublished = {\url{https://github.com/facebookresearch/detectron}},
year = {2018}
}
We provide a large set of baseline results and trained models available for download in the Detectron Model Zoo.
Please find installation instructions for Caffe2 and Detectron in `INSTALL.md`.
After installation, please see `GETTING_STARTED.md` for brief tutorials covering inference and training with Detectron.
To start, please check the troubleshooting section of our installation instructions as well as our FAQ. If you couldn't find help there, try searching our GitHub issues. We intend the issues page to be a forum in which the community collectively troubleshoots problems.
If bugs are found, we appreciate pull requests (including adding Q&A's to `FAQ.md` and improving our installation instructions and troubleshooting documents). Please see `CONTRIBUTING.md` for more information about contributing to Detectron.
- Data Distillation: Towards Omni-Supervised Learning. Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He. Tech report, arXiv, Dec. 2017.
- Learning to Segment Every Thing. Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick. Tech report, arXiv, Nov. 2017.
- Non-Local Neural Networks. Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. Tech report, arXiv, Nov. 2017.
- Mask R-CNN. Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. IEEE International Conference on Computer Vision (ICCV), 2017.
- Focal Loss for Dense Object Detection. Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. IEEE International Conference on Computer Vision (ICCV), 2017.
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Tech report, arXiv, June 2017.
- Detecting and Recognizing Human-Object Interactions. Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He. Tech report, arXiv, Apr. 2017.
- Feature Pyramid Networks for Object Detection. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- Aggregated Residual Transformations for Deep Neural Networks. Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- R-FCN: Object Detection via Region-based Fully Convolutional Networks. Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. Conference on Neural Information Processing Systems (NIPS), 2016.
- Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Conference on Neural Information Processing Systems (NIPS), 2015.
- Fast R-CNN. Ross Girshick. IEEE International Conference on Computer Vision (ICCV), 2015.