CVPR2023 Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

This repo is the official implementation of the CVPR 2023 paper "Semi-DETR: Semi-Supervised Object Detection with Detection Transformers". Semi-DETR is the first work on semi-supervised object detection designed for detection transformers.

Usage

Our code is based on the awesome codebase provided by Soft-Teacher[1].

Requirements

  • Ubuntu 18.04
  • Anaconda3 with python=3.8
  • Pytorch=1.9.0
  • mmdetection=2.16.0+fe46ffe
  • mmcv=1.3.16
  • cuda=10.2
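
As a quick sanity check before installing, the pins above can be compared with what is already in the active environment. This is a minimal sketch and not part of the repository: the function name `check_versions` is hypothetical, and the distribution names (`torch`, `mmcv-full`, `mmdet`) are assumptions about how the packages were installed.

```python
from importlib import metadata

# Pins copied from the requirements list above. The distribution names are
# assumptions (for example, mmcv 1.x is often installed as "mmcv-full").
REQUIRED = {"torch": "1.9.0", "mmcv-full": "1.3.16", "mmdet": "2.16.0"}


def check_versions(required=REQUIRED):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Prefix match tolerates local suffixes such as "1.9.0+cu102".
        ok = installed is not None and installed.startswith(expected)
        report[name] = (expected, installed, ok)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_versions().items():
        print(f"{name}: expected {expected}, found {installed}, ok={ok}")
```

A package that is not installed is reported with `None` as its found version rather than raising an error.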

Installation

This project is built on mmdetection. Please install mmdet in editable mode first:

cd thirdparty/mmdetection && python -m pip install -e .

Following mmdetection, we develop our detection transformer module and semi-supervised module in a similar way, and they also need to be installed first (please set the module name in the 'setup.py' file to 'detr_od' and 'detr_ssod' in turn):

cd ../../ && python -m pip install -e .

These steps install 'mmdet', 'detr_od' and 'detr_ssod' in the environment. You also need to compile the CUDA ops for deformable attention:

cd detr_od/models/utils/ops
python setup.py build install
# unit test (optional); every check should report True
python test.py
cd ../../../..  # back to the repository root (ops is four levels deep)

Data Preparation

  • Download the COCO dataset
  • Execute the following command to generate data set splits:
# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#  coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct

For concrete instructions on what should be downloaded, please refer to lines 11-24 of tools/dataset/prepare_coco_data.sh. You can also download our generated semi-supervised dataset splits from semi-coco-splits.
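
Before running the preparation script, it can help to confirm that the `data` symlink resolves to the layout shown in the comment above. The following is a minimal sketch, not part of the repository; the helper name `missing_coco_dirs` is hypothetical, and the directory names are taken from the layout comment.

```python
from pathlib import Path

# Sub-directories taken from the layout comment above.
EXPECTED_DIRS = ("train2017", "val2017", "unlabeled2017", "annotations")


def missing_coco_dirs(data_root="data"):
    """Return the expected sub-directories that are absent from <data_root>/coco."""
    coco = Path(data_root) / "coco"
    return [d for d in EXPECTED_DIRS if not (coco / d).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs()
    print("layout OK" if not missing else f"missing: {missing}")
```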

  • Download the PASCAL VOC dataset
  • Execute the following commands to download and extract the archives:
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xf VOCtrainval_06-Nov-2007.tar
tar -xf VOCtest_06-Nov-2007.tar
tar -xf VOCtrainval_11-May-2012.tar

# resulting format
# YOUR_DATA/
#   - VOCdevkit
#     - VOC2007
#       - Annotations
#       - JPEGImages
#       - ...
#     - VOC2012
#       - Annotations
#       - JPEGImages
#       - ...

Following prior work, we convert the PASCAL VOC dataset into COCO format and evaluate the model with COCO-style mAP. Execute the following command to convert the dataset format:

python scripts/voc_to_coco.py --devkit_path ${VOCdevkit-PATH} --out-dir ${VOCdevkit-PATH}
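
To illustrate what the format conversion involves, the core step is turning VOC corner boxes (xmin, ymin, xmax, ymax) into COCO boxes [x, y, width, height]. The sketch below is an assumption-laden illustration and not the logic of scripts/voc_to_coco.py: the function name `voc_objects_to_coco` is hypothetical, it keeps the class name as a string instead of mapping it to a COCO `category_id`, and it does not adjust for VOC's 1-based pixel coordinates.

```python
import xml.etree.ElementTree as ET


def voc_objects_to_coco(xml_text):
    """Convert the <object> entries of one VOC annotation to COCO-style dicts."""
    root = ET.fromstring(xml_text)
    annotations = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            # Simplification: real COCO annotations carry a numeric category_id.
            "category_name": obj.findtext("name"),
            "bbox": [xmin, ymin, width, height],  # COCO uses [x, y, w, h]
            "area": width * height,
        })
    return annotations
```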

Training

  • To train a model in the fully supervised setting (optional):

We implement DINO in mmdetection following the original official repo. If you want to train the fully supervised DINO model yourself and check our implementation, you can run:

sh tools/dist_train_detr_od.sh dino_detr 8

This trains DINO with a batch size of 16 for 12 epochs. We also provide the resulting checkpoint dino_sup_12e_ckpt and our training log dino_sup_12e_log for this fully supervised model.

  • To train a model in the partially labeled data setting:
sh tools/dist_train_detr_ssod.sh dino_detr_ssod ${FOLD} ${PERCENT} ${GPUS}

For example, you can run the following script to train our model on 10% labeled data with 8 GPUs on the 1st split:

sh tools/dist_train_detr_ssod.sh dino_detr_ssod 1 10 8
  • To train a model in the fully labeled data setting:
sh tools/dist_train_detr_ssod_coco_full.sh <NUM_GPUS>

For example, to train our R50 model with 8 GPUs:

sh tools/dist_train_detr_ssod_coco_full.sh 8
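
For intuition about the ${FOLD} and ${PERCENT} arguments in the partially labeled setting, a split can be thought of as a seeded random sample of the labeled images. The sketch below is only an illustration under that assumption; it is not the repository's split-generation code, and the function name `split_labeled_unlabeled` and the use of the fold index as the random seed are hypothetical.

```python
import random


def split_labeled_unlabeled(image_ids, percent, fold):
    """Pick `percent`% of the ids as labeled, seeded by `fold`; the rest are unlabeled."""
    ids = sorted(image_ids)
    rng = random.Random(fold)  # assumption: the same fold always gives the same split
    n_labeled = round(len(ids) * percent / 100)
    labeled = set(rng.sample(ids, n_labeled))
    unlabeled = [i for i in ids if i not in labeled]
    return sorted(labeled), unlabeled
```

With 1000 image ids and `percent=10`, this yields 100 labeled and 900 unlabeled ids, and repeating the call with the same `fold` reproduces the same split.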

Evaluation

python tools/test.py <CONFIG_FILE_PATH> <CHECKPOINT_PATH> --eval bbox
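
The COCO-style bbox metric is built on box IoU: COCO mAP averages AP over the IoU thresholds 0.50, 0.55, ..., 0.95, while AP50 uses the 0.50 threshold alone. The sketch below only illustrates these two ingredients; it is not the evaluation code invoked by tools/test.py, and the names `box_iou` and `COCO_IOU_THRESHOLDS` are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given in COCO format [x, y, w, h]."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by COCO mAP.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

A prediction counts as a true positive at a given threshold only if its IoU with a matching ground-truth box reaches that threshold, so the stricter thresholds reward tighter localization.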

We also provide some of our trained models below:

COCO:

| Setting   | mAP              | Weights |
|-----------|------------------|---------|
| 1% Data   | 30.50 $\pm$ 0.30 | ckpt    |
| 5% Data   | 40.10 $\pm$ 0.15 | ckpt    |
| 10% Data  | 43.5 $\pm$ 0.10  | ckpt    |
| Full Data | 50.5             | ckpt    |

VOC:

| Setting        | AP50 | mAP  | Weights |
|----------------|------|------|---------|
| Unlabel: VOC12 | 86.1 | 65.2 | ckpt    |

[1] End-to-End Semi-Supervised Object Detection with Soft Teacher

Citation

If you find our repo useful for your research, please cite us:

@inproceedings{zhang2023semi,
  title={Semi-DETR: Semi-Supervised Object Detection With Detection Transformers},
  author={Zhang, Jiacheng and Lin, Xiangru and Zhang, Wei and Wang, Kuo and Tan, Xiao and Han, Junyu and Ding, Errui and Wang, Jingdong and Li, Guanbin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23809--23818},
  year={2023}
}