This project implements FCOS for object detection, as described in:
FCOS: Fully Convolutional One-Stage Object Detection,
Tian, Zhi, Chunhua Shen, Hao Chen, and Tong He,
arXiv preprint arXiv:1904.01355 (2019).
The full paper is available at: https://arxiv.org/abs/1904.01355.
- Totally anchor-free: FCOS completely avoids the complicated computation related to anchor boxes and all of their hyper-parameters; instead, each feature-map location directly regresses a box (see the sketch after this list).
- Memory-efficient: FCOS uses about half the training memory footprint of its anchor-based counterpart RetinaNet.
- Better performance: FCOS outperforms RetinaNet under exactly the same training and testing settings.
- State-of-the-art performance: Without bells and whistles, FCOS achieves state-of-the-art performance: 41.0% AP with ResNet-101-FPN and 42.1% AP with ResNeXt-32x8d-101 on COCO test-dev.
- Faster: FCOS enjoys faster training and inference than RetinaNet.
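To make the anchor-free formulation concrete: instead of matching anchors, FCOS treats feature-map locations that fall inside a ground-truth box as positive samples and regresses their distances (l, t, r, b) to the four box sides. The following is a minimal, self-contained sketch of that target computation, not code from this repository; `fcos_regression_targets` is a hypothetical helper name used only for illustration.

```
import torch

def fcos_regression_targets(locations, boxes):
    """Illustrative sketch of FCOS's anchor-free regression targets.

    locations: (N, 2) tensor of (x, y) points in image coordinates.
    boxes: (M, 4) tensor of ground-truth boxes as (x0, y0, x1, y1).
    Returns (N, M, 4) distances (l, t, r, b) and an (N, M) positive mask.
    """
    xs, ys = locations[:, 0], locations[:, 1]
    l = xs[:, None] - boxes[None, :, 0]  # distance to the left edge
    t = ys[:, None] - boxes[None, :, 1]  # distance to the top edge
    r = boxes[None, :, 2] - xs[:, None]  # distance to the right edge
    b = boxes[None, :, 3] - ys[:, None]  # distance to the bottom edge
    targets = torch.stack([l, t, r, b], dim=2)
    # A location is a positive sample for a box only if it lies inside it,
    # i.e. all four distances are positive.
    inside = targets.min(dim=2).values > 0
    return targets, inside
```

In the paper, a location inside multiple boxes is assigned to the box with the minimal area, and boxes are distributed across FPN levels by the magnitude of these distances; this is what removes all anchor-related hyper-parameters.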
We train with 8 Nvidia V100 GPUs, but because FCOS is memory-efficient, a fully-fledged ResNet-50-FPN based FCOS can also be trained on 4 1080Ti GPUs.
This FCOS implementation is based on maskrcnn-benchmark, so its installation is the same as that of the original maskrcnn-benchmark. Please check INSTALL.md for installation instructions. You may also want to see the original README.md of maskrcnn-benchmark.
The inference command line on the COCO minival split:

```
python tools/test_net.py \
    --config-file configs/fcos/fcos_R_50_FPN_1x.yaml \
    MODEL.WEIGHT models/FCOS_R_50_FPN_1x.pth \
    TEST.IMS_PER_BATCH 4
```
Please note that:

- If your model has a different name, please replace `models/FCOS_R_50_FPN_1x.pth` with that name.
- If you encounter an out-of-memory error, please try to reduce `TEST.IMS_PER_BATCH` to 1.
- If you want to evaluate another model, please change `--config-file` to its config file (in configs/fcos) and `MODEL.WEIGHT` to its weights file; see the example after this list.
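For example, evaluating FCOS_R_101_FPN_2x from the table below could look like the following sketch; the config file name and weights path here are assumptions based on the naming scheme above, so adjust them to your setup:

```
python tools/test_net.py \
    --config-file configs/fcos/fcos_R_101_FPN_2x.yaml \
    MODEL.WEIGHT models/FCOS_R_101_FPN_2x.pth \
    TEST.IMS_PER_BATCH 1
```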
For your convenience, we provide the following trained models (more models are coming soon).
Model | Total training mem (GB) | Multi-scale training | Testing time / im | AP (minival) | AP (test-dev) | Link |
---|---|---|---|---|---|---|
FCOS_R_50_FPN_1x | 29.3 | No | 71ms | 36.6 | 37.0 | download |
FCOS_R_101_FPN_2x | 44.1 | Yes | 74ms | 40.9 | 41.0 | download |
FCOS_X_101_32x8d_FPN_2x | 72.9 | Yes | 122ms | 42.0 | 42.1 | download |
[1] 1x means the model is trained for 90K iterations.
[2] 2x means the model is trained for 180K iterations.
[3] We report total training memory footprint on all GPUs instead of the memory footprint per GPU as in maskrcnn-benchmark.
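For example, the 29.3 GB reported for FCOS_R_50_FPN_1x is the sum over the 8 training GPUs, i.e. roughly 3.7 GB per GPU.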
The following command line will train FCOS_R_50_FPN_1x on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):
```
python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --skip-test \
    --config-file configs/fcos/fcos_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_R_50_FPN_1x
```
Note that:

- If you want to use fewer GPUs, please reduce `--nproc_per_node`. The total batch size does not depend on `nproc_per_node`. If you want to change the total batch size, please change `SOLVER.IMS_PER_BATCH` in configs/fcos/fcos_R_50_FPN_1x.yaml.
- The models will be saved into `OUTPUT_DIR`.
- If you want to train FCOS with other backbones, please change `--config-file`.
- Sometimes you may encounter a deadlock with 100% GPU usage, which might be an NCCL problem. Please try `export NCCL_P2P_DISABLE=1` before running the training command line; a sketch combining these options follows this list.
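For instance, the following sketch trains the same model on 4 GPUs and applies the NCCL workaround above; because `SOLVER.IMS_PER_BATCH` is left untouched, the total batch size and schedule stay the same:

```
# Optional workaround for the NCCL deadlock mentioned above.
export NCCL_P2P_DISABLE=1
python -m torch.distributed.launch \
    --nproc_per_node=4 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --skip-test \
    --config-file configs/fcos/fcos_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_R_50_FPN_1x
```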
Any pull requests or issues are welcome.
Please consider citing our paper in your publications if the project helps your research. The following is a BibTeX reference.
```
@article{tian2019fcos,
  title={FCOS: Fully Convolutional One-Stage Object Detection},
  author={Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
  journal={arXiv preprint arXiv:1904.01355},
  year={2019}
}
```
For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact the authors.