In this project we reproduce YOLOF, a one-stage object detector proposed in the paper You Only Look One-level Feature, based on PaddleDetection.
YOLOF uses only one feature level, yet achieves accuracy competitive with detectors that use multiple feature levels (FPN). This is largely due to the following two novel designs:
- Dilated Encoder: uses dilated convolutions and residual blocks to enlarge the receptive field while still covering multiple receptive-field sizes.
- Uniform Assigner: assigns the k nearest anchors to each ground-truth box (GT) so that every GT gets a balanced number of positive samples.
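To make the Uniform Assigner idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code in this repo: the function name `assign_k_nearest`, the `(cx, cy, w, h)` box layout, and the use of L1 distance are assumptions made for the example, and it leaves out any further filtering a real assigner may apply.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h), an assumed layout


def l1_distance(a: Box, b: Box) -> float:
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign_k_nearest(anchors: Sequence[Box], gts: Sequence[Box], k: int) -> List[List[int]]:
    """For each GT box, return the indices of its k nearest anchors.

    Every GT receives min(k, len(anchors)) positives, regardless of its size.
    """
    assignments = []
    for gt in gts:
        # Rank all anchors by distance to this GT; ties are broken by anchor index.
        ranked = sorted(range(len(anchors)),
                        key=lambda i: (l1_distance(anchors[i], gt), i))
        assignments.append(ranked[:k])
    return assignments


# Anchors of a single size laid out on a 4x4 grid with stride 32 (toy values).
anchors = [(32.0 * x + 16.0, 32.0 * y + 16.0, 32.0, 32.0)
           for y in range(4) for x in range(4)]
gts = [
    (20.0, 20.0, 10.0, 10.0),    # small object
    (64.0, 64.0, 120.0, 120.0),  # large object
]

positives = assign_k_nearest(anchors, gts, k=4)
for gt, idxs in zip(gts, positives):
    print(gt, "->", idxs)
```

In this toy setup the small and the large object both end up with exactly four positive anchors, which is the balance the Uniform Assigner aims for.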
One benefit of using only one feature level is fewer FLOPs and higher speed. One reported result shows that YOLOF achieves the same AP on COCO as the multi-level-feature RetinaNet with 57% fewer FLOPs and a 2.5x speed-up. We strongly recommend this paper; please visit here to check it out.
There are two official implementations: one is based on Detectron2 and the other on cvpods. MMDetection has also implemented YOLOF and included it in its model list. Here we follow both the official Detectron2 version and MMDetection's version.
Note that our implementation is based on PaddleDetection, which is built on the Paddle deep learning platform.
source | backbone | AP | epochs | config | model | train-log | dataset |
---|---|---|---|---|---|---|---|
official | R-50-C5 | 37.7 | 12.3(detail) | config | model[qr6o] | NA | coco2017 |
mmdet | R-50-C5 | 37.5 | 12 | config | model | log | coco2017 |
this | R-50-C5 | 37.5 | 12 | config | model[3z7q] | log | coco2017 |
this_re-train | R-50-C5 | 37.4 | 12 | config | model[6faq] | log | coco2017 |
We train and test our implementation on the COCO 2017 dataset. The models we provide here were trained on the Baidu AIStudio platform, using 4 V100 GPUs with 8 images per GPU. The data in the first two rows of the table above is taken directly from the official GitHub repos. According to MMDetection's comment, both the mmdet and the official versions show an AP variation of about 0.3. We therefore re-trained with the same config and got 37.4 AP. We thank the MMDetection team for providing this important information.
Please check out the config for more information on the model.
Please check out the train-log for more information on the loss during training.
The implementation is based on PaddleDetection v2.3. The directory PaddleDetection/ contains essentially the whole PaddleDetection code base. All code related to YOLOF is located at PaddleDetection/ppdet/yolof.
Requirements:
- Python 3.7+
- Paddle v2.2: follow this to install
Clone this repo and install:
git clone https://github.com/thisisi3/Paddle-YOLOF.git
pip install -e Paddle-YOLOF/PaddleDetection -v
Follow this for detailed steps on installation of PaddleDetection and follow this to learn how to use PaddleDetection.
Data preparation:
cd Paddle-YOLOF
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
mkdir -p dataset/coco
unzip annotations_trainval2017.zip -d dataset/coco
unzip train2017.zip -d dataset/coco
unzip val2017.zip -d dataset/coco
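Once the commands above finish, dataset/coco should contain annotations/, train2017/ and val2017/. The helper below is a small sketch for checking that layout before starting a long training run. The function name `missing_coco_entries` is our own and is not part of PaddleDetection; the annotation file names are those shipped in the COCO 2017 annotation archive.

```python
from pathlib import Path
from typing import List

# Entries expected under dataset/coco after unzipping the three archives.
EXPECTED = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
]


def missing_coco_entries(root: str = "dataset/coco") -> List[str]:
    """Return the expected files or directories that are absent under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]


missing = missing_coco_entries()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("COCO 2017 layout looks complete.")
```

Run it from the Paddle-YOLOF directory so that the relative path dataset/coco resolves correctly.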
You can also download COCO 2017 from AIStudio if the official download is slow.
Download pretrained backbone:
YOLOF uses a caffe-style ResNet, which corresponds to variant-a in PaddleDetection. PaddleDetection does not ship pretrained weights for this variant, so we converted the weights manually; you can download them here[rpsb]. After downloading, put the weights under the directory pretrain/.
Train YOLOF on a single GPU:
python PaddleDetection/tools/train.py -c configs/yolof_r50_c5_1x_coco_8x4GPU.yml --eval
Train YOLOF on 4 GPUs:
python -m paddle.distributed.launch --gpus 0,1,2,3 PaddleDetection/tools/train.py -c configs/yolof_r50_c5_1x_coco_8x4GPU.yml --eval
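The config name ends in 8x4GPU, matching the setup of 8 images per GPU on 4 GPUs used for the released models. If the same config is run on a single GPU, the total batch size shrinks accordingly. The sketch below works through that arithmetic and applies the common linear scaling heuristic for the learning rate. This is an assumption on our side rather than something the config does automatically, and `reference_lr` is a placeholder value, not the value in the config file.

```python
def total_batch_size(images_per_gpu: int, num_gpus: int) -> int:
    """Number of images processed per optimizer step across all GPUs."""
    return images_per_gpu * num_gpus


def scaled_lr(reference_lr: float, reference_batch: int, new_batch: int) -> float:
    """Linear scaling heuristic: keep learning_rate / batch_size constant."""
    return reference_lr * new_batch / reference_batch


# Setup used for the released models: 8 images per GPU on 4 GPUs.
reference_batch = total_batch_size(images_per_gpu=8, num_gpus=4)
# Same per-GPU batch size on a single GPU.
single_gpu_batch = total_batch_size(images_per_gpu=8, num_gpus=1)

# Placeholder for illustration only; read the real value from the config file.
reference_lr = 0.01

print("total batch, 4 GPUs:", reference_batch)
print("total batch, 1 GPU:", single_gpu_batch)
print("suggested single-GPU lr:", scaled_lr(reference_lr, reference_batch, single_gpu_batch))
```

Whether a scaled learning rate reproduces the reported AP on one GPU has not been verified here, so treat the result as a starting point.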
If you do not want to evaluate AP during training, simply remove the --eval option.
Evaluate the AP of YOLOF:
python PaddleDetection/tools/eval.py -c configs/yolof_r50_c5_1x_coco_8x4GPU.yml -o weights=path_to_model_final.pdparams
Quick demo:
Thanks to PaddleDetection, we can use the inference script PaddleDetection/tools/infer.py it provides to visualize the detection results of YOLOF by running the following command:
python PaddleDetection/tools/infer.py -c configs/yolof_r50_c5_1x_coco_8x4GPU.yml -o weights=path_to_model_final.pdparams --infer_img demo/000000185250.jpg --output_dir demo/out/ --draw_threshold 0.5
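The --draw_threshold 0.5 option controls which detections appear in the output image: as we understand it, boxes whose confidence score falls below the threshold are not drawn. The snippet below is a minimal sketch of that filtering step. The record layout and the sample detections are made up for illustration and do not reflect PaddleDetection's internal data structures.

```python
from typing import Any, Dict, List

# Hypothetical detection record: {"label": str, "score": float, "bbox": [x, y, w, h]}
Detection = Dict[str, Any]


def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep only detections whose score reaches the threshold."""
    return [d for d in detections if d["score"] >= threshold]


# Made-up detections, only to show the effect of the threshold.
detections = [
    {"label": "person", "score": 0.91, "bbox": [10, 20, 50, 120]},
    {"label": "dog", "score": 0.55, "bbox": [80, 90, 60, 40]},
    {"label": "kite", "score": 0.12, "bbox": [200, 15, 20, 20]},
]

for det in keep_confident(detections, threshold=0.5):
    print(det["label"], det["score"], det["bbox"])
```

Lowering the threshold shows more, but less reliable, boxes; raising it keeps only the most confident ones.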
The test image:
After adding bboxes:
Both images can be found at demo/.
We would like to thank Baidu AIStudio for providing a generous amount of high-quality GPU resources.
We also thank the following amazing open-source projects: