Transformer Knowledge Distillation for Efficient Semantic Segmentation [arxiv]

Structure: TransKD

Introduction

We propose TransKD, a structural framework that distills knowledge from both the feature maps and the patch embeddings of vision transformers.

Requirements

Environment: create a conda environment and activate it

conda create -n TransKD python=3.6
conda activate TransKD

Additional Python packages: poly scheduler, plus the following:

pytorch == 1.7.1+cu92
torchvision == 0.8.2+cu92
mmsegmentation == 0.15.0
mmcv-full == 1.3.10
numpy
visdom
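
For readers unfamiliar with the term, "poly" scheduling in semantic segmentation usually means polynomial learning-rate decay. The sketch below shows that decay rule only; it is not the scheduler package this repo depends on. The function name poly_lr, the default power of 0.9, and the end_lr parameter are assumptions chosen for illustration.

```python
def poly_lr(base_lr, step, max_steps, power=0.9, end_lr=0.0):
    """Polynomial learning-rate decay from base_lr down to end_lr.

    Hypothetical helper for illustration; power=0.9 is a common choice
    in segmentation training, not a value taken from this repo.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    # Clamp so the rate stays at end_lr once max_steps has been reached.
    step = min(max(step, 0), max_steps)
    factor = (1.0 - step / max_steps) ** power
    return (base_lr - end_lr) * factor + end_lr
```

With these defaults the rate starts at base_lr, decreases monotonically, and reaches end_lr at step == max_steps.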

Datasets:

Cityscapes: download gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip from the official Cityscapes website, then prepare the 19-class labels with createTrainIdLabelImgs.py from cityscapesscripts.
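
Before training, it can help to confirm that label generation covered every image. The sketch below assumes the standard Cityscapes layout after extraction (leftImg8bit/<split>/<city>/ and gtFine/<split>/<city>/) and the *_gtFine_labelTrainIds.png names that createTrainIdLabelImgs.py writes. The helper find_missing_train_ids is hypothetical and not part of this repo.

```python
from pathlib import Path


def find_missing_train_ids(root, split="train"):
    """Return the expected labelTrainIds paths that do not exist yet.

    Hypothetical helper; assumes the standard Cityscapes directory layout.
    """
    root = Path(root)
    suffix = "_leftImg8bit.png"
    missing = []
    image_dir = root / "leftImg8bit" / split
    for image in sorted(image_dir.glob("*/*" + suffix)):
        city = image.parent.name
        # Strip the image suffix to recover the shared frame identifier.
        stem = image.name[: -len(suffix)]
        label = root / "gtFine" / split / city / (stem + "_gtFine_labelTrainIds.png")
        if not label.is_file():
            missing.append(label)
    return missing
```

Calling find_missing_train_ids("/path/to/data") should return an empty list once the labels are prepared. Whether --datadir expects this same root directory is an assumption to verify against the training script.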

Usage

Download the teacher checkpoints into the folder checkpoints/.

Example:

python train/train_transkd.py --datadir /path/to/data --kdtype TransKD-Base

Publication

If you find this repo useful, please consider citing the following paper [PDF]: