[ICCV 2023] Official implementation of the paper "Detection Transformer with Stable Matching"

Primary language: Python. License: Apache License 2.0.

🐲 Stable-DINO: Detection Transformer with Stable Matching

IDEA-CVR, IDEA-Research

Shilong Liu*, Tianhe Ren*, Jiayu Chen*, Zhaoyang Zeng, Hao Zhang, Feng Li, Hongyang Li, Jun Huang, Hang Su, Jun Zhu, Lei Zhang📧.

(*) equal contribution, (📧) corresponding author.

[Stable-DINO Paper] [Focal-Stable-DINO Report] [BibTex] [Code in detrex]

✨ News

  • 14 Jul, 2023: Stable-DINO is accepted to ICCV 2023!
  • 26 Apr, 2023: Combined with the FocalNet-Huge backbone, Focal-Stable-DINO achieves 64.6 AP on COCO val2017 and 64.8 AP on COCO test-dev without any test-time augmentation! Check our Technical Report for more details.
  • 12 Apr, 2023: The preprint of our paper is available on arXiv!

💡 Highlights

  • High performance. Possibly the strongest object detector: 63.8 AP on COCO with a Swin-Large backbone (only 218M parameters).
  • Scalable. Combined with the larger FocalNet-Huge backbone (only 689M parameters), Stable-DINO further improves to 64.6 AP on COCO val2017 and 64.8 AP on COCO test-dev without any test-time augmentation.
  • Easy to use. Only a few lines of DINO code need to be modified.
  • Lightweight. Almost no extra cost during training or inference compared with DINO.
  • Generalizable. Easy to combine with existing DETR variants to boost their performance.

[Figure: performance]

📖 Methods:

[Figure: stable matching]

[Figure: memory fusion]

🍟 Results:

  • ResNet-50 backbone [figure: R50]

  • Swin-L backbone [figure: swinl]

  • Comparison with SOTA methods [figure: sota]

  • Stable-MaskDINO [figure: smd]

Run

Our code is implemented on top of detrex.

  1. Install detrex and prepare the data

Please follow the detrex instructions for installation and data preparation.

  2. Training scripts

We provide a training example for Stable-DINO R50. Refer to the detrex documentation for more details.

CUDA_VISIBLE_DEVICES=0 \
python tools/train_net.py \
    --config-file projects/stabledino/configs/stabledino_r50_4scale_12ep.py \
    --num-gpus 1 \
    dataloader.train.total_batch_size=4 \
    train.output_dir="./output/stabledino_r50_4scale_12ep" \
    train.test_with_nms=0.80 
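
When launching several runs that differ only in GPU count, batch size, or output directory, it can be convenient to assemble the command above programmatically. The following is a minimal sketch, not part of detrex or this repository: the helper name `build_train_command` and its parameters are hypothetical, and it only reuses the flags and overrides shown in the example command.

```python
import shlex


def build_train_command(config_file, output_dir, num_gpus=1,
                        total_batch_size=4, nms_threshold=0.80):
    """Assemble the argument list for the training example above.

    Hypothetical helper: it only mirrors the flags and config overrides
    that appear in the documented command and does not validate them.
    """
    return [
        "python", "tools/train_net.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
        # key=value pairs after the flags are passed through as config overrides
        f"dataloader.train.total_batch_size={total_batch_size}",
        f"train.output_dir={output_dir}",
        f"train.test_with_nms={nms_threshold}",
    ]


cmd = build_train_command(
    config_file="projects/stabledino/configs/stabledino_r50_4scale_12ep.py",
    output_dir="./output/stabledino_r50_4scale_12ep",
)

# Print a shell-ready string; run it from the detrex root,
# for example with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Returning the command as a list rather than a single string lets it be passed to `subprocess.run` without shell quoting issues. `CUDA_VISIBLE_DEVICES` is left to the calling environment.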

🍗 Related Projects:

🥑 Citing Stable-DINO

If you use Stable-DINO in your research or wish to refer to the baseline results published here, please use the following BibTeX entry.

@misc{liu2023detection,
      title={Detection Transformer with Stable Matching}, 
      author={Shilong Liu and Tianhe Ren and Jiayu Chen and Zhaoyang Zeng and Hao Zhang and Feng Li and Hongyang Li and Jun Huang and Hang Su and Jun Zhu and Lei Zhang},
      year={2023},
      eprint={2304.04742},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
@misc{ren2023strong,
      title={A Strong and Reproducible Object Detector with Only Public Datasets}, 
      author={Tianhe Ren and Jianwei Yang and Shilong Liu and Ailing Zeng and Feng Li and Hao Zhang and Hongyang Li and Zhaoyang Zeng and Lei Zhang},
      year={2023},
      eprint={2304.13027},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}