
Official implementation of the paper "Guided Distillation for Semi-Supervised Instance Segmentation".


Guided Distillation for Semi-Supervised Instance Segmentation

Tariq Berrada, Camille Couprie, Karteek Alahari, Jakob Verbeek.

[WACV website] [arXiv] [BibTeX]


Guided Distillation is a semi-supervised training methodology for instance segmentation that builds on the Mask2Former model. It achieves substantial improvements over the previous state of the art in terms of mask-AP. Most notably, our method outperforms the fully supervised Mask-RCNN COCO baseline while using only 2% of the annotations. With a ViT (DinoV2) backbone, our method reaches 31.0 mask-AP while using only 0.4% of the annotations.

Our implementation is based on detectron2 and provides support for both COCO and Cityscapes with multiple backbones such as R50, Swin, ViT (DETR) and ViT (DinoV2).

Features

  • Semi-supervised distillation training for instance segmentation with different percentages of labeled data.
  • Tested on both COCO and Cityscapes.
  • Support for R50, Swin, ViT (DETR) and ViT (DinoV2) backbones.

Installation

Our codebase is based on detectron2 and Mask2Former. An example environment with the useful dependencies is described in install.md.
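For orientation, a minimal environment sketch is given below. The package versions and the deformable-attention ops path are assumptions based on the standard detectron2 and Mask2Former setups; install.md remains the authoritative reference.

    conda create -n guided_distillation python=3.9 -y
    conda activate guided_distillation
    # Pick the torch/torchvision build matching your CUDA version (assumption).
    pip install torch torchvision
    # detectron2 from source, as in its official instructions.
    python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
    # Compile the deformable attention ops used by the Mask2Former pixel decoder
    # (path follows the Mask2Former layout; adjust if it differs in this repo).
    cd mask2former/modeling/pixel_decoder/ops && sh make.sh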

Prepare Datasets for Mask2Former

For experiments with different amounts of labeled data, you will need to generate annotation structures for each percentage you want to use in your experiments. Please follow the instructions at tools/datasets to do so.
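To illustrate what such a percentage split amounts to, the sketch below subsamples a COCO-style annotation file to keep a given fraction of labeled images. It is a hypothetical helper for illustration only (the function name and arguments are not from this repo); the scripts in tools/datasets are the ones to use for actual experiments.

    # Hypothetical helper illustrating a percentage-based COCO split;
    # use the scripts in tools/datasets for real experiments.
    import json
    import random

    def subsample_coco(ann_file, out_file, percentage, seed=0):
        with open(ann_file) as f:
            coco = json.load(f)
        random.seed(seed)
        n_keep = max(1, int(len(coco["images"]) * percentage / 100))
        kept = random.sample(coco["images"], n_keep)
        kept_ids = {img["id"] for img in kept}
        coco["images"] = kept
        coco["annotations"] = [a for a in coco["annotations"] if a["image_id"] in kept_ids]
        with open(out_file, "w") as f:
            json.dump(coco, f)

    # e.g. keep 5% of the labeled images
    subsample_coco("instances_train2017.json", "instances_train2017_5pct.json", percentage=5)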

Model Training

Training is split into two consecutive steps.

  • Pre-training: Train the model using the available labeled data.
  • Burn-in and distillation: Train the student model using both labeled and unlabeled samples with targets provided by the teacher.

The following section provides examples of scripts to launch for different use cases.

Example on Cityscapes with a R50 backbone

Example with an R50 backbone on Cityscapes with 5% of labeled data on 2 GPUs.

  • Train the teacher model with the available labeled data only:

    python3 train_net.py --config-file ./configs/cityscapes/instance-segmentation/maskformer2_R50_bs16_90k.yaml --num-gpus 2 --num-machines 1 SSL.PERCENTAGE 5 SSL.TRAIN_SSL False OUTPUT_DIR *OUTPUT/DIR/TEACHER*
    
  • Train the semi-supervised model using the pretrained teacher checkpoint:

    python3 train_net.py --config-file ./configs/cityscapes/instance-segmentation/maskformer2_R50_bs16_90k.yaml --num-gpus 2 --num-machines 1 SSL.PERCENTAGE 5 SSL.TRAIN_SSL True SSL.TEACHER_CKPT *PATH/TO/CKPT* OUTPUT_DIR *OUTPUT/DIR/STUDENT* SSL.BURNIN_ITER *NB_ITER*
    

For Swin backbones, pretrained weights are expected to be downloaded and converted according to Mask2Former's scripts.
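For instance, converting a downloaded Swin checkpoint to the detectron2 format looks roughly like the following; the script name and location are taken from the Mask2Former repository and may differ here.

    # Convert an ImageNet-pretrained Swin checkpoint to the detectron2 format
    # expected by the configs (script name as in Mask2Former's tools/).
    python tools/convert-pretrained-swin-model-to-d2.py swin_tiny_patch4_window7_224.pth swin_tiny_patch4_window7_224.pkl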

Example on COCO with a DINO backbone

Example with a DINOv2 backbone and 0.4% of labeled data. For percentages below 1%, the SSL.PERCENTAGE argument takes the value 100/percentage, i.e. 250 in this case (see the sketch after these commands):

  • Train the teacher model with the available labeled data only:

    python3 -W ignore train_net.py --config-file ./configs/coco/instance-segmentation/dinov2/maskformer2_dinov2_large_bs16_50ep.yaml --num-gpus 2 --num-machines 1 SSL.PERCENTAGE 250 SSL.TRAIN_SSL False OUTPUT_DIR *OUTPUT/DIR/TEACHER*
    
  • Train the semi-supervised model using the pretrained teacher checkpoint:

    python3 -W ignore train_net.py --config-file ./configs/coco/instance-segmentation/dinov2/maskformer2_dinov2_large_bs16_50ep.yaml --num-gpus 2 --num-machines 1 SSL.PERCENTAGE 250 SSL.TRAIN_SSL True SSL.TEACHER_CKPT *PATH/TO/CKPT* OUTPUT_DIR *OUTPUT/DIR/STUDENT* SSL.BURNIN_ITER *NB_ITER*
    

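To make the SSL.PERCENTAGE convention explicit, here is a small sketch of the mapping; it reflects the convention shown in the two examples above (5% -> 5, 0.4% -> 250) and is not code from the repository.

    # Maps a labeled-data fraction to the SSL.PERCENTAGE value used in the commands above
    # (reading of the convention in this README, not repository code).
    def ssl_percentage_arg(percentage: float) -> float:
        # 5% labeled data -> 5, while 0.4% labeled data -> 100 / 0.4 = 250.
        return percentage if percentage >= 1 else 100 / percentage

    print(ssl_percentage_arg(5))    # -> 5
    print(ssl_percentage_arg(0.4))  # -> 250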
You can choose to evaluate the teacher, the student, or both models during semi-supervised training. To this end, add the SSL.EVAL_WHO argument to your script and set it to STUDENT (default), TEACHER or BOTH.
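For example, to track both models during distillation, append the flag to the student command (same arguments as before, only SSL.EVAL_WHO added):

    python3 train_net.py --config-file ./configs/cityscapes/instance-segmentation/maskformer2_R50_bs16_90k.yaml --num-gpus 2 --num-machines 1 SSL.PERCENTAGE 5 SSL.TRAIN_SSL True SSL.TEACHER_CKPT *PATH/TO/CKPT* OUTPUT_DIR *OUTPUT/DIR/STUDENT* SSL.BURNIN_ITER *NB_ITER* SSL.EVAL_WHO BOTH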

Citing Guided Distillation

If you use Guided Distillation in your research or wish to refer to the baseline results published in the manuscript, please use the following BibTeX entry.

@InProceedings{Berrada_2024_WACV,
    author    = {Berrada, Tariq and Couprie, Camille and Alahari, Karteek and Verbeek, Jakob},
    title     = {Guided Distillation for Semi-Supervised Instance Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {475-483}
}

Acknowledgement

Code is largely based on MaskFormer and Detectron2.

See the CONTRIBUTING file for how to help out.

License

"Guided Distillation for Semi-Supervised Instance Segmentation" is licensed, as found in the LICENSE file.