
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Primary language: Python. License: Apache-2.0.


RT-DETR: DETRs Beat YOLOs on Real-time Object Detection



[Figure: ppdetr_overview]

This is the official implementation of the paper "DETRs Beat YOLOs on Real-time Object Detection".

Updates!!!

  • [2024.01.23] Fix data augmentation differences from the paper in rtdetr_pytorch #84
  • [2023.11.07] Add pytorch rtdetr_r34vd in response to requests #107, #114
  • [2023.11.05] Upgrade the logic of remap_mscoco_category to facilitate training on custom datasets; see details in the Train custom data part. #81
  • [2023.10.23] Add a discussion on deployment; onnxruntime, TensorRT, and OpenVINO are supported
  • [2023.10.12] Add fine-tuning code for the pytorch version; you can now fine-tune rtdetr from pretrained weights
  • [2023.09.19] Upload pytorch weights converted from the paddle version
  • [2023.08.24] Release rtdetr-r18 pretrained models on Objects365. 49.2 mAP and 217 FPS
  • [2023.08.22] Upload rtdetr_pytorch source code. Please enjoy it ❤️
  • [2023.08.15] Release rtdetr-r101 pretrained models on Objects365. 56.2 mAP and 74 FPS
  • [2023.07.30] Release rtdetr-r50 pretrained models on Objects365. 55.3 mAP and 108 FPS
  • [2023.07.28] Fix some bugs, and add some comments. 1, 2
  • [2023.07.13] Upload training logs on COCO
  • [2023.05.17] Release RT-DETR-R18, RT-DETR-R34, RT-DETR-R50-m (an example of scaling)
  • [2023.04.17] Release RT-DETR-R50, RT-DETR-R101, RT-DETR-L, RT-DETR-X

Implementations

| Model | Epoch | Input shape | Dataset | $AP^{val}$ | $AP^{val}_{50}$ | Params (M) | FLOPs (G) | T4 TensorRT FP16 (FPS) |
|---|---|---|---|---|---|---|---|---|
| RT-DETR-R18 | 6x | 640 | COCO | 46.5 | 63.8 | 20 | 60 | 217 |
| RT-DETR-R34 | 6x | 640 | COCO | 48.9 | 66.8 | 31 | 92 | 161 |
| RT-DETR-R50-m | 6x | 640 | COCO | 51.3 | 69.6 | 36 | 100 | 145 |
| RT-DETR-R50 | 6x | 640 | COCO | 53.1 | 71.3 | 42 | 136 | 108 |
| RT-DETR-R101 | 6x | 640 | COCO | 54.3 | 72.7 | 76 | 259 | 74 |
| RT-DETR-HGNetv2-L | 6x | 640 | COCO | 53.0 | 71.6 | 32 | 110 | 114 |
| RT-DETR-HGNetv2-X | 6x | 640 | COCO | 54.8 | 73.1 | 67 | 234 | 74 |
| RT-DETR-R18 | 5x | 640 | COCO + Objects365 | 49.2 | 66.6 | 20 | 60 | 217 |
| RT-DETR-R50 | 2x | 640 | COCO + Objects365 | 55.3 | 73.4 | 42 | 136 | 108 |
| RT-DETR-R101 | 2x | 640 | COCO + Objects365 | 56.2 | 74.6 | 76 | 259 | 74 |

Notes:

  • COCO + Objects365 in the table means the model was fine-tuned on COCO starting from weights pretrained on Objects365.
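
To make the speed/accuracy trade-off in the table easier to read, the following is a minimal sketch that converts the reported FPS into an approximate per-image latency and picks the fastest model above a chosen AP threshold. The numbers are copied from the COCO-only rows of the table; the helper names `latency_ms` and `fastest_with_min_ap` are hypothetical and not part of the repository, and the conversion assumes FPS is simply the reciprocal of average per-image latency, which the table does not state explicitly.

```python
# (model, AP_val, params_M, flops_G, fps) copied from the COCO-only rows of the table.
ROWS = [
    ("RT-DETR-R18", 46.5, 20, 60, 217),
    ("RT-DETR-R34", 48.9, 31, 92, 161),
    ("RT-DETR-R50-m", 51.3, 36, 100, 145),
    ("RT-DETR-R50", 53.1, 42, 136, 108),
    ("RT-DETR-R101", 54.3, 76, 259, 74),
    ("RT-DETR-HGNetv2-L", 53.0, 32, 110, 114),
    ("RT-DETR-HGNetv2-X", 54.8, 67, 234, 74),
]


def latency_ms(fps: float) -> float:
    """Per-image latency in milliseconds, ASSUMING latency = 1 / FPS."""
    return 1000.0 / fps


def fastest_with_min_ap(rows, min_ap: float):
    """Return the highest-FPS row whose AP is at least min_ap, or None if no row qualifies."""
    candidates = [row for row in rows if row[1] >= min_ap]
    return max(candidates, key=lambda row: row[4]) if candidates else None


if __name__ == "__main__":
    for name, ap, _params, _flops, fps in ROWS:
        print(f"{name:<18} AP={ap:<5} {latency_ms(fps):6.2f} ms/img")
    print("Fastest with AP >= 53.0:", fastest_with_min_ap(ROWS, 53.0))
```

Under that assumption, 217 FPS corresponds to roughly 4.6 ms per image and 74 FPS to roughly 13.5 ms per image.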

Introduction

We propose the Real-Time DEtection TRansformer (RT-DETR, aka RTDETR), to the best of our knowledge the first real-time end-to-end object detector. RT-DETR-L achieves 53.0% AP on COCO val2017 and 114 FPS on a T4 GPU, while RT-DETR-X achieves 54.8% AP and 74 FPS, outperforming all YOLO detectors of the same scale in both speed and accuracy. Furthermore, RT-DETR-R50 achieves 53.1% AP and 108 FPS, outperforming DINO-Deformable-DETR-R50 by 2.2% AP in accuracy and by about 21 times in FPS.
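
The comparison with DINO-Deformable-DETR-R50 is stated only as a gap, so a short worked calculation may help show what it implies. The sketch below backs out the baseline figures from the numbers in the paragraph above. These are derived estimates, not values reported in this README, and because the speedup is given as "about 21 times" the FPS result is approximate. The function name `implied_baseline` is hypothetical.

```python
def implied_baseline(ap: float, fps: float, ap_gap: float, speedup: float):
    """Back out a baseline's (AP, FPS) from a reported AP gap and FPS speedup factor."""
    baseline_ap = round(ap - ap_gap, 1)   # AP is reported to one decimal place
    baseline_fps = fps / speedup          # approximate, since the speedup is approximate
    return baseline_ap, baseline_fps


if __name__ == "__main__":
    # RT-DETR-R50: 53.1% AP, 108 FPS; gap of 2.2% AP and about 21x in FPS (from the text).
    dino_ap, dino_fps = implied_baseline(ap=53.1, fps=108, ap_gap=2.2, speedup=21)
    print(f"Implied baseline: about {dino_ap}% AP at roughly {dino_fps:.1f} FPS")
```

Read this way, the claim corresponds to a baseline of about 50.9% AP running at roughly 5 FPS on the same hardware.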

Citation

If you use RT-DETR in your work, please cite it with the following BibTeX entry:

@misc{lv2023detrs,
      title={DETRs Beat YOLOs on Real-time Object Detection},
      author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},
      year={2023},
      eprint={2304.08069},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}