In this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on a globally explicit centralized feature regulation. We first propose a spatial explicit visual center scheme, in which a lightweight MLP captures the global long-range dependencies while a parallel learnable visual center mechanism captures the local corner regions of the input image. Building on this, we then propose a globally centralized regulation for the commonly used feature pyramid in a top-down fashion, where the explicit visual center information obtained from the deepest intra-layer feature is used to regulate the frontal shallow features. Compared with existing feature pyramids, CFP not only captures the global long-range dependencies but also efficiently obtains an all-round yet discriminative feature representation.
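The sketch below illustrates the two parallel branches of the explicit visual center described above: a lightweight MLP branch for global long-range dependencies and a learnable visual center branch for local, corner-like regions. It is a simplified illustration, not the repository's implementation; the module names (`LightweightMLP`, `LearnableVisualCenter`, `ExplicitVisualCenter`), layer choices, and codebook size are illustrative assumptions.

```python
# Minimal sketch of the explicit visual center idea, NOT the official CFP code.
# Shapes, layer choices, and the number of learnable centers are assumptions.
import torch
import torch.nn as nn

class LightweightMLP(nn.Module):
    """Captures global long-range dependencies over the flattened feature map."""
    def __init__(self, channels, hidden_ratio=4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels * hidden_ratio),
            nn.GELU(),
            nn.Linear(channels * hidden_ratio, channels),
        )

    def forward(self, x):                              # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        tokens = tokens + self.mlp(self.norm(tokens))  # residual token mixing
        return tokens.transpose(1, 2).reshape(b, c, h, w)

class LearnableVisualCenter(nn.Module):
    """Aggregates features around a small set of learnable center vectors to
    emphasize local, corner-like regions (illustrative version)."""
    def __init__(self, channels, num_centers=64):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_centers, channels))
        self.scale = nn.Parameter(torch.ones(num_centers))
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):                              # x: (B, C, H, W)
        b, c, h, w = x.shape
        feats = x.flatten(2).transpose(1, 2)           # (B, N, C), N = H*W
        diff = feats.unsqueeze(2) - self.centers       # (B, N, K, C)
        assign = torch.softmax(-self.scale * diff.pow(2).sum(-1), dim=2)  # (B, N, K)
        residual = (assign.unsqueeze(-1) * diff).sum(2)                   # (B, N, C)
        gate = torch.sigmoid(residual).transpose(1, 2).reshape(b, c, h, w)
        return self.proj(x * gate)

class ExplicitVisualCenter(nn.Module):
    """Runs both branches in parallel and fuses their outputs."""
    def __init__(self, channels):
        super().__init__()
        self.mlp_branch = LightweightMLP(channels)
        self.lvc_branch = LearnableVisualCenter(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([self.mlp_branch(x), self.lvc_branch(x)], dim=1))

if __name__ == "__main__":
    evc = ExplicitVisualCenter(channels=256)
    out = evc(torch.randn(1, 256, 20, 20))  # e.g. the deepest pyramid level
    print(out.shape)                         # torch.Size([1, 256, 20, 20])
```

In CFP, the output of such a block on the deepest pyramid level is then propagated top-down to regulate the shallower levels; that regulation step is omitted here for brevity.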
Below we provide the pre-trained weights of CFP, using YOLOX as the baseline detector.
| Model | Input size | mAP (%) | Weights |
|---|---|---|---|
| CFP-s | 640 | 41.1 | weight |
| CFP-m | 640 | 46.4 | weight |
| CFP-l | 640 | 49.4 | weight |
Install CFP from source:

```shell
git clone git@github.com:QY1994-0919/CFP-main.git
cd CFP-main
pip3 install -v -e .  # or python3 setup.py develop
```

Then link your COCO dataset into the repository (using the layout expected by YOLOX, i.e. `annotations/`, `train2017/`, and `val2017/` under the COCO root):

```shell
cd CFP-main
ln -s /path/to/your/COCO ./datasets/COCO
```
Train the CFP-s/m/l models on COCO, where `-f` selects the experiment description, `-d` the number of GPUs, and `-b` the total batch size; `--fp16` enables mixed-precision training, `-o` occupies GPU memory in advance, and the optional `--cache` caches images in RAM for faster training:

```shell
python -m cfp.tools.train -f cfp-s -d 2 -b 16 --fp16 -o [--cache]
python -m cfp.tools.train -f cfp-m -d 2 -b 16 --fp16 -o [--cache]
python -m cfp.tools.train -f cfp-l -d 2 -b 16 --fp16 -o [--cache]
```
Evaluate each model with its corresponding checkpoint (the optional `--fuse` flag fuses conv and BN layers for faster inference):

```shell
python -m cfp.tools.eval -n cfp-s -c cfp_s.pth -b 16 -d 2 --conf 0.001 [--fp16] [--fuse]
python -m cfp.tools.eval -n cfp-m -c cfp_m.pth -b 16 -d 2 --conf 0.001 [--fp16] [--fuse]
python -m cfp.tools.eval -n cfp-l -c cfp_l.pth -b 16 -d 2 --conf 0.001 [--fp16] [--fuse]
```
Thanks to the YOLOX team for the wonderful open-source project!
If you find CFP useful in your research, please consider citing:

