This repository provides a demo of the paper "Soft Hybrid Knowledge Distillation against Deep Neural Networks" on the CIFAR-100 dataset.
- Python 3.6 (Anaconda >=5.2.0 recommended)
- torch (>=1.1.0 recommended)
- torchvision (>=0.3.0 recommended)
- pandas
- numpy
- NVIDIA GPU + CUDA + cuDNN
- CIFAR-10, CIFAR-100, ImageNet, and others
- Download the datasets and extract them inside the `data` directory.
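If you prefer to fetch the data programmatically, here is a minimal sketch using torchvision; the normalization statistics are the commonly used CIFAR-100 values and are an assumption, not something read from this repo:

```python
import torchvision
import torchvision.transforms as transforms

# Commonly used CIFAR-100 normalization statistics (an assumption, not taken
# from this repo's code).
transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5071, 0.4865, 0.4409),
                         (0.2673, 0.2564, 0.2762)),
])

# download=True fetches and extracts the archive into ./data if it is missing.
train_set = torchvision.datasets.CIFAR100(
    root='./data', train=True, download=True, transform=transform)
```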
- Teacher Training:
python teacher.py --arch wrn_40_2 --lr 0.05 --gpu-id 0
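For orientation, the teacher stage is presumably standard supervised training. Below is a generic cross-entropy training sketch, not the repo's actual teacher.py; the SGD momentum and weight decay values are common CIFAR defaults and are assumptions here.

```python
# Generic supervised training epoch for a teacher network. This is a sketch
# of the usual CIFAR recipe (momentum 0.9, weight decay 5e-4 are assumed
# defaults), not a copy of teacher.py.
import torch
import torch.nn as nn

def train_teacher_epoch(model, train_loader, lr=0.05, device='cuda'):
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=5e-4)
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)  # plain cross-entropy
        loss.backward()
        optimizer.step()
```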
- Student Training:
python student.py --t-path ./experiments/teacher_wrn_40_2_seed0/ --s-arch wrn_16_2 --lr 0.05 --gpu-id 0
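The paper's soft hybrid distillation loss is defined in the paper itself. As a generic point of reference only, here is a sketch of the classic soft-target KD loss (Hinton et al.) that most distillation codebases build on; the temperature T and weight alpha are illustrative assumptions, not the paper's settings.

```python
# Classic soft-target KD loss (Hinton et al.), shown for reference only;
# this is NOT the paper's soft hybrid loss. T and alpha are illustrative.
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft targets: KL divergence between temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```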
- Evaluation:
- The distilled model for the VGG-13 (teacher) and MobileNetV2 (student) pair on CIFAR-100 is available at this link. Download and extract it into the `experiments` directory.
- You should achieve 71.95% accuracy on the CIFAR-100 test set.
- Thanks to CRD. We built this repository on top of CRD's codebase.