DAMO-YOLO: a fast and accurate object detection method with several new techniques, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).



Welcome to DAMO-YOLO! It is a fast and accurate object detection method developed by the TinyML Team at Alibaba DAMO Data Analytics and Intelligence Lab, and it achieves higher performance than the state-of-the-art YOLO series. DAMO-YOLO extends YOLO with several new techniques, including Neural Architecture Search (NAS) backbones, an efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement. For more details, please refer to our arXiv report. Here you will find not only powerful models but also highly efficient training strategies and complete tools covering everything from training to deployment.


  • [2022/11/27: We release DAMO-YOLO v0.1.0!]
    • Release DAMO-YOLO object detection models, including DAMO-YOLO-T, DAMO-YOLO-S, and DAMO-YOLO-M.
    • Release model conversion tools for easy deployment, with support for ONNX, TensorRT-fp32, and TensorRT-fp16.

Web Demo

  • DAMO-YOLO-S is integrated into ModelScope. Try out the Web Demo.

Model Zoo

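| Model | Size | mAPval | Latency T4 (ms) | FLOPs (G) | Params (M) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DAMO-YOLO-T | 640 | 41.8 | 2.78 | 18.1 | 8.5 | torch, onnx |
| DAMO-YOLO-T* | 640 | 43.0 | 2.78 | 18.1 | 8.5 | torch, onnx |
| DAMO-YOLO-S | 640 | 45.6 | 3.83 | 37.8 | 16.3 | torch, onnx |
| DAMO-YOLO-S* | 640 | 46.8 | 3.83 | 37.8 | 16.3 | torch, onnx |
| DAMO-YOLO-M | 640 | 48.7 | 5.62 | 61.8 | 28.2 | torch, onnx |
| DAMO-YOLO-M* | 640 | 50.0 | 5.62 | 61.8 | 28.2 | torch, onnx |
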
  • We report the mAP of the models on the COCO2017 validation set, with multi-class NMS.
  • The latency in this table is measured without post-processing.
  • * denotes the model trained with distillation.
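
As a rough reading aid for the table, the sketch below converts the reported T4 latency into an approximate throughput and computes the mAP gain from distillation. The numbers are copied from the table; the dictionary layout and function names are illustrative only, and the conversion assumes the latency column is in milliseconds per image. Like the latency itself, the throughput estimate excludes post-processing.

```python
# Values copied from the Model Zoo table:
# (mAP without distillation, mAP with distillation, T4 latency, assumed to be in ms).
MODELS = {
    "DAMO-YOLO-T": (41.8, 43.0, 2.78),
    "DAMO-YOLO-S": (45.6, 46.8, 3.83),
    "DAMO-YOLO-M": (48.7, 50.0, 5.62),
}


def throughput_fps(latency_ms):
    """Images per second implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


def distillation_gain(base_map, distilled_map):
    """mAP improvement of the distilled (*) model over the plain one."""
    return round(distilled_map - base_map, 1)


for name, (base, distilled, latency) in MODELS.items():
    gain = distillation_gain(base, distilled)
    fps = throughput_fps(latency)
    print(f"{name}: +{gain} mAP from distillation, about {fps:.0f} images/s")
```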

Quick Start


Installation

Step1. Install DAMO-YOLO.

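git clone https://github.com/tinyvision/damo-yolo.git
# enter the repository root, where requirements.txt lives
cd damo-yolo
conda create -n DAMO-YOLO python=3.7 -y
conda activate DAMO-YOLO
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
# make the damo package importable from the scripts under tools/
export PYTHONPATH=$PWD:$PYTHONPATH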

Step2. Install pycocotools.

pip3 install cython
pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

Demo

Step1. Download a pretrained torch model or ONNX model from the Model Zoo table, e.g., damoyolo_tinynasL25_S.pth or damoyolo_tinynasL25_S.onnx.

Step2. Use -f (config filename) to specify your detector's config. For example:

# torch
python tools/torch_inference.py -f configs/damoyolo_tinynasL25_S.py --ckpt /path/to/your/damoyolo_tinynasL25_S.pth --path assets/dog.jpg

# onnx
python tools/onnx_inference.py -f configs/damoyolo_tinynasL25_S.py --onnx /path/to/your/damoyolo_tinynasL25_S.onnx --path assets/dog.jpg

Reproduce our results on COCO

Step1. Prepare the COCO dataset.

cd <DAMO-YOLO Home>
ln -s /path/to/your/coco ./datasets/coco

Step2. Reproduce our results on COCO by specifying -f (config filename):

python -m torch.distributed.launch --nproc_per_node=8 tools/train.py -f configs/damoyolo_tinynasL25_S.py

Finetune on your data

Step1. Prepare your custom data in COCO format, and make sure the dataset name ends with coco. Organize the dataset as follows:

├── Custom_coco
│   ├── annotations
│   │   ├── instances_train2017.json
│   │   └── instances_val2017.json
│   ├── train2017
│   ├── val2017
│   ├── LICENSE
│   ├── README.txt
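
Before training, it can help to confirm that a custom dataset matches the layout above. The following is a minimal sketch, not part of DAMO-YOLO: the function name `check_coco_layout` and the example path `datasets/Custom_coco` are hypothetical, and the check only tests that the expected files and directories exist and that the directory name ends with coco.

```python
from pathlib import Path

# Files and directories taken from the tree above (LICENSE and README.txt are not checked).
REQUIRED = (
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
)


def check_coco_layout(root):
    """Return a list of problems found under `root`; an empty list means the layout looks right."""
    root = Path(root)
    problems = []
    if not root.name.endswith("coco"):
        problems.append(f"dataset name '{root.name}' does not end with 'coco'")
    for rel in REQUIRED:
        if not (root / rel).exists():
            problems.append(f"missing {rel}")
    return problems


# Hypothetical location; point this at your own dataset directory.
print(check_coco_layout("datasets/Custom_coco") or "layout OK")
```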

Step2. Add the data directory to damo/config/paths_catalog.py. Customize your config file based on the default configs, e.g., damoyolo_tinynasL25_S.py. Don't forget to load the pretrained model by setting config.train.finetune_path='./damoyolo_tinynasL25_S.pth', and set the learning_rate, training_epochs, datasets, and other hyperparameters to suit your data.

Step3. Start finetuning:

# finetune
python -m torch.distributed.launch --nproc_per_node=8 tools/train.py -f configs/damoyolo_tinynasL25_S_finetune.py

# evaluate
python -m torch.distributed.launch --nproc_per_node=8 tools/eval.py -f configs/damoyolo_tinynasL25_S.py --ckpt /path/to/your/damoyolo_tinynasL25_S.pth

Customize tinynas backbone

Step1. If you want to customize your own backbone, please refer to the [MAE-NAS Tutorial for DAMO-YOLO](https://github.com/alibaba/lightweight-neural-architecture-search/blob/main/scripts/damo-yolo/Tutorial_NAS_for_DAMO-YOLO_cn.md). It is a detailed tutorial on how to obtain an optimal backbone under a latency/FLOPs budget.

Step2. After the search completes, replace the structure text in the configs with the searched structure. Then set the backbone name to TinyNAS_res or TinyNAS_csp to get your own ResNet-like or CSPNet-like backbone, respectively. Note that out_indices differs between TinyNAS_res and TinyNAS_csp.

structure = self.read_structure('tinynas_customize.txt')
# Option A: ResNet-like TinyNAS backbone
TinyNAS = {'name': 'TinyNAS_res',
           'out_indices': (2, 4, 5)}
# Option B: CSPNet-like TinyNAS backbone (use instead of Option A)
TinyNAS = {'name': 'TinyNAS_csp',
           'out_indices': (2, 3, 4)}
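
Because out_indices must match the backbone family, one way to avoid mixing them up is to derive both from a single lookup. The sketch below is illustrative only: `make_backbone_cfg` is a hypothetical helper, not a DAMO-YOLO function, and the returned dict contains just the two keys shown in the snippet above.

```python
# out_indices per backbone family, as listed in the config snippet above.
OUT_INDICES = {
    "TinyNAS_res": (2, 4, 5),  # ResNet-like
    "TinyNAS_csp": (2, 3, 4),  # CSPNet-like
}


def make_backbone_cfg(name):
    """Return a backbone dict whose out_indices match the chosen backbone name."""
    if name not in OUT_INDICES:
        raise ValueError(f"unknown backbone {name!r}; expected one of {sorted(OUT_INDICES)}")
    return {"name": name, "out_indices": OUT_INDICES[name]}


TinyNAS = make_backbone_cfg("TinyNAS_csp")
print(TinyNAS)
```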



Deploy

Step1. Install ONNX.

pip install onnx==1.8.1
pip install onnxruntime==1.8.0
pip install onnx-simplifier==0.3.5
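
To confirm that the environment actually has the pinned versions, a small check such as the following can be run after installation. This is a sketch under stated assumptions: the helper names are hypothetical, the pins are copied from the commands above, and the fallback to `pkg_resources` is there because `importlib.metadata` is not available on Python 3.7.

```python
# Pins copied from the install commands above.
PINNED = {"onnx": "1.8.1", "onnxruntime": "1.8.0", "onnx-simplifier": "0.3.5"}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # Python 3.7: fall back to setuptools' pkg_resources
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_mismatches(pinned):
    """Map each package whose installed version differs from its pin to (wanted, found)."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = installed_version(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches(PINNED).items():
    print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```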

Step2. Install CUDA, cuDNN, TensorRT, and PyCUDA.

2.1 CUDA

wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
sudo sh cuda_10.2.89_440.33.01_linux.run
# persist the CUDA paths in ~/.bashrc, then reload it
echo 'export PATH=$PATH:/usr/local/cuda-10.2/bin' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.2/lib64' >> ~/.bashrc
source ~/.bashrc

2.2 CuDNN

sudo cp cuda/include/* /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

2.3 TensorRT

cd TensorRT-
pip install tensorrt-

2.4 pycuda

pip install pycuda==2022.1

Model Convert

Step1. Convert the torch model to an ONNX model or a TensorRT engine; the output file is generated in ./deploy. The --end2end flag exports the TensorRT engine with NMS included. The --trt_eval flag evaluates the exported TensorRT engine on the coco_val dataset once the export completes.

# onnx export 
python tools/converter.py -f configs/damoyolo_tinynasL25_S.py -c damoyolo_tinynasL25_S.pth --batch_size 1 --img_size 640

# trt export
python tools/converter.py -f configs/damoyolo_tinynasL25_S.py -c damoyolo_tinynasL25_S.pth --batch_size 1 --img_size 640 --trt --end2end --trt_eval

Step2. Evaluate the TensorRT engine on the coco_val dataset. The --end2end flag evaluates the engine exported with NMS (trt_with_nms).

python tools/trt_eval.py -f configs/damoyolo_tinynasL25_S.py -trt deploy/damoyolo_tinynasL25_S_end2end.trt --batch_size 1 --img_size 640 --end2end

Step3. Run the ONNX or TensorRT inference demo, specifying the test image with -p or --path as in the commands below. The --end2end flag runs inference with the engine exported with NMS (trt_with_nms).

# onnx inference
python tools/onnx_inference.py -f configs/damoyolo_tinynasL25_S.py --onnx /path/to/your/damoyolo_tinynasL25_S.onnx --path assets/dog.jpg

# trt inference
python tools/trt_inference.py -f configs/damoyolo_tinynasL25_S.py -t deploy/damoyolo_tinynasL25_S_end2end_fp16_bs1.trt -p assets/dog.jpg --img_size 640 --end2end
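
The torch, ONNX, and TensorRT demos in this README differ only in the script and the flag that carries the model file. The sketch below picks the matching command from the file extension; it is a convenience wrapper written for illustration, `build_inference_cmd` is a hypothetical name, and it uses only the scripts and flags that appear in the commands in this README.

```python
from pathlib import Path


def build_inference_cmd(config, model, image, img_size=640, end2end=False):
    """Return the argument list for the inference script that matches the model file type."""
    suffix = Path(model).suffix
    if suffix == ".pth":
        return ["python", "tools/torch_inference.py", "-f", config,
                "--ckpt", model, "--path", image]
    if suffix == ".onnx":
        return ["python", "tools/onnx_inference.py", "-f", config,
                "--onnx", model, "--path", image]
    if suffix == ".trt":
        cmd = ["python", "tools/trt_inference.py", "-f", config,
               "-t", model, "-p", image, "--img_size", str(img_size)]
        if end2end:
            cmd.append("--end2end")  # engine was exported with NMS
        return cmd
    raise ValueError(f"unsupported model file type: {suffix!r}")


cmd = build_inference_cmd(
    "configs/damoyolo_tinynasL25_S.py",
    "deploy/damoyolo_tinynasL25_S_end2end_fp16_bs1.trt",
    "assets/dog.jpg",
    end2end=True,
)
print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root.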

Intern Recruitment

We are recruiting research interns. If you are interested in object detection, model quantization, or NAS, please send your resume to xiuyu.sxy@alibaba-inc.com.


If you use DAMO-YOLO in your research, please cite our work by using the following BibTeX entry:

@article{damoyolo,
   title={DAMO-YOLO: A Report on Real-Time Object Detection Design},
   author={Xianzhe Xu and Yiqi Jiang and Weihua Chen and Yilun Huang and Yuan Zhang and Xiuyu Sun},
   journal={arXiv preprint arXiv:2211.15444},
   year={2022}
}