Benchmarking KD
avishreekh opened this issue · 0 comments
avishreekh commented
We need to benchmark the following algorithms on three datasets (MNIST, CIFAR10, CIFAR100), so that we can be confident our implementations are reasonably accurate across datasets.
We also need to ensure that distillation works with a variety of student networks. @Het-Shah has suggested reporting results with ResNet18, MobileNet v2, and ShuffleNet v2 as student networks, with ResNet50 as the teacher network for all distillations. A rough sketch of the intended setup follows the list below.
- VanillaKD
- TAKD
- Noisy Teacher
- Attention
- BANN
- Bert2lstm
- RCO
- Messy Collab
- Soft Random
- CSKD
- DML
- Self-training
- Virtual Teacher
- RKD Loss
- KA/ProbShift
- KA/LabelSmoothReg
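
For reference, here is a minimal sketch of how one benchmark run could look, using VanillaKD as an example (the same pattern would apply to the other distillers). It assumes the `VanillaKD` interface shown in the README (teacher, student, loaders, optimizers), and the torchvision models, epoch counts, and learning rates are placeholders rather than agreed benchmark settings:

```python
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

from KD_Lib.KD import VanillaKD  # assuming the interface from the README

# CIFAR10 loaders; the same setup would be repeated for MNIST and CIFAR100
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_loader = DataLoader(
    datasets.CIFAR10("data", train=True, download=True, transform=transform),
    batch_size=128, shuffle=True)
test_loader = DataLoader(
    datasets.CIFAR10("data", train=False, download=True, transform=transform),
    batch_size=128)

# ResNet50 teacher shared across all runs (torchvision models used for illustration)
teacher = models.resnet50(num_classes=10)
teacher_optimizer = optim.SGD(teacher.parameters(), lr=0.01, momentum=0.9)

students = [
    ("ResNet18", models.resnet18),
    ("MobileNet v2", models.mobilenet_v2),
    ("ShuffleNet v2", models.shufflenet_v2_x1_0),
]

for i, (name, student_fn) in enumerate(students):
    student = student_fn(num_classes=10)
    student_optimizer = optim.SGD(student.parameters(), lr=0.01, momentum=0.9)
    distiller = VanillaKD(teacher, student, train_loader, test_loader,
                          teacher_optimizer, student_optimizer)
    if i == 0:
        distiller.train_teacher(epochs=20)  # train the shared teacher only once
    distiller.train_student(epochs=20)      # distil into the current student
    distiller.evaluate(teacher=False)       # report student accuracy for the benchmark table
```

The results for each (algorithm, dataset, student) combination can then be collected into a table in this issue or in the docs.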
If you wish to work on any of the above algorithms, please mention them in the discussion below.