Cross-Image Relational Knowledge Distillation for Semantic Segmentation
This repository contains the source code of CIRKD (Cross-Image Relational Knowledge Distillation for Semantic Segmentation).
Requirements
Ubuntu 18.04 LTS
Python 3.8 (Anaconda is recommended)
CUDA 11.1
PyTorch 1.8.0
NCCL for CUDA 11.1
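To compare a local setup against the versions listed above, a small check along the following lines may help. This is only a sketch and not part of the repository: the function name describe_environment is hypothetical, and it reports what is installed without enforcing any particular version.

```python
import sys


def describe_environment():
    """Collect Python, PyTorch, CUDA, and NCCL details for comparison with the list above."""
    info = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    try:
        import torch
        import torch.distributed as dist
    except ImportError:
        info["torch"] = None  # PyTorch is not installed in this environment
        return info
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    info["gpu_count"] = torch.cuda.device_count()
    # NCCL is only usable when the distributed package itself was built in.
    info["nccl_available"] = dist.is_available() and dist.is_nccl_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the reported versions differ from the ones listed here, training may still work, but that combination is not covered by this README.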
Backbones pretrained on ImageNet:
Performance on Cityscapes
All models are trained on 8 NVIDIA GeForce RTX 3090 GPUs.
To use mixed-precision training, add --fp16; see the fp16 demos: demo1, demo2, demo3.
Role | Network | Method | Val mIoU | Test mIoU | Pretrained | train script |
---|---|---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 78.07 | 77.46 | Google Drive | sh |
Student | DeepLabV3-ResNet18 | Baseline | 74.21 | 73.45 | - | sh |
Student | DeepLabV3-ResNet18 | CIRKD | 76.38 | 75.05 | Google Drive | sh |
Student | DeepLabV3-MobileNetV2 | Baseline | 73.12 | 72.36 | - | sh |
Student | DeepLabV3-MobileNetV2 | CIRKD | 75.42 | 74.03 | Google Drive | sh |
Student | PSPNet-ResNet18 | Baseline | 72.55 | 72.29 | - | sh |
Student | PSPNet-ResNet18 | CIRKD | 74.73 | 74.05 | Google Drive | sh |
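For readers unfamiliar with the metric, the mIoU columns above report mean intersection-over-union across classes, in percent. The following pure-Python sketch illustrates the standard definition on flat label sequences. It is not the repository's evaluation code: the function name mean_iou and the default ignore label of 255 are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over classes, from flat label sequences."""
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels carrying the ignore label are excluded entirely
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(f"mIoU = {100 * mean_iou(pred, target, num_classes=3):.2f}")
```

In practice the counts are accumulated over the whole validation set before the per-class ratios are taken, rather than averaged image by image.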
Performance of Segmentation KD methods on Cityscapes
Method | Val mIoU (DeepLabV3-ResNet18 student) | Val mIoU (DeepLabV3-MobileNetV2 student) | train script |
---|---|---|---|
Teacher network | DeepLabV3-ResNet101 | DeepLabV3-ResNet101 | |
Teacher baseline | 78.07 | 78.07 | |
Student network | DeepLabV3-ResNet18 | DeepLabV3-MobileNetV2 | |
Student baseline | 74.21 | 73.12 | |
SKD [3] | 75.42 | 73.82 | sh |
IFVD [4] | 75.59 | 73.50 | sh |
CWD [5] | 75.55 | 74.66 | sh |
DSD [6] | 74.81 | 74.11 | sh |
CIRKD [7] | 76.38 | 75.42 | |
The references are listed in references.md.
Evaluate pre-trained models on the Cityscapes test set
Run test_cityscapes.sh, then zip the resulting images and submit the archive to the Cityscapes test server.
Note: The current code has been reorganized and has not been thoroughly tested. If you have any questions, please contact us.
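The zip step mentioned above can be done with any archive tool. As one possible sketch using only the Python standard library, the helper below collects image files from a results directory into a single archive. The function name zip_predictions, the "*.png" pattern, and the paths in the usage note are assumptions for illustration; the actual output location depends on how test_cityscapes.sh is configured.

```python
import zipfile
from pathlib import Path


def zip_predictions(result_dir, archive_path, pattern="*.png"):
    """Write every file matching `pattern` under `result_dir` into one zip archive."""
    result_dir = Path(result_dir)
    # Collect the file list first so the archive itself is never picked up.
    files = sorted(p for p in result_dir.rglob(pattern) if p.is_file())
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store paths relative to result_dir so no absolute paths leak into the archive.
            archive.write(path, arcname=path.relative_to(result_dir).as_posix())
    return len(files)
```

For example, zip_predictions("path/to/results", "submission.zip") would return the number of images written, where both paths are placeholders to be replaced with the real output directory and the desired archive name.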
Performance of Segmentation KD methods on Pascal VOC
The Pascal VOC dataset for segmentation is available at Baidu Drive
Role | Network | Method | Val mIoU | train script |
---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 77.67 | sh |
Student | DeepLabV3-ResNet18 | Baseline | 73.21 | sh |
Student | DeepLabV3-ResNet18 | CIRKD | 74.50 | sh |
Student | PSPNet-ResNet18 | Baseline | 73.33 | sh |
Student | PSPNet-ResNet18 | CIRKD | 74.78 | sh |
Performance of Segmentation KD methods on CamVid
The CamVid dataset for segmentation is available at Baidu Drive
Role | Network | Method | Val mIoU | train script |
---|---|---|---|---|
Teacher | DeepLabV3-ResNet101 | - | 69.84 | sh |
Student | DeepLabV3-ResNet18 | Baseline | 66.92 | sh |
Student | DeepLabV3-ResNet18 | CIRKD | 68.21 | sh |
Student | PSPNet-ResNet18 | Baseline | 66.73 | sh |
Student | PSPNet-ResNet18 | CIRKD | 68.65 | sh |
Citation
@inproceedings{yang2022cross,
title={Cross-image relational knowledge distillation for semantic segmentation},
author={Yang, Chuanguang and Zhou, Helong and An, Zhulin and Jiang, Xue and Xu, Yongjun and Zhang, Qian},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={12319--12328},
year={2022}
}