
Multi-teachers-Knowledge-Distillation

Distilling knowledge from an ensemble of multiple teacher networks into a single multi-head student network.
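
The approach in one picture: the student shares a single convolutional body and attaches one small classification head per teacher, so each head can mimic one teacher while the body learns features common to all of them. Below is a minimal Keras sketch of such an architecture; the layer sizes and the `build_student` name are illustrative assumptions, not the notebook's actual code.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_student(num_heads=3, num_classes=10, input_shape=(32, 32, 3)):
    """Multi-head student: one shared body, one small head per teacher.

    Layer sizes here are placeholders; the repository's notebook uses a
    ResNet body instead of this toy stack.
    """
    inputs = keras.Input(shape=input_shape)

    # Shared body - learns features common to all teachers.
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # One lightweight head per teacher - each head outputs raw logits.
    outputs = [
        layers.Dense(num_classes, name=f"head_{i}")(
            layers.Dense(64, activation="relu")(x)
        )
        for i in range(num_heads)
    ]
    return keras.Model(inputs, outputs, name="multi_head_student")
```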

Reference Papers:

  1. Hydra: Preserving Ensemble Diversity for Model Distillation (https://arxiv.org/abs/2001.04694)
  2. Distilling the Knowledge in a Neural Network (https://arxiv.org/abs/1503.02531)
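
Combining the two papers above: each student head is trained against its teacher's temperature-softened logits using Hinton et al.'s soft-target loss, and the per-head losses are averaged so the heads jointly preserve the ensemble's diversity, as in Hydra. The sketch below illustrates this objective; the temperature `T`, weight `alpha`, and function names are illustrative defaults, not values taken from this repository.

```python
import tensorflow as tf

def distillation_loss(teacher_logits, student_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD loss for one (teacher, head) pair.

    Soft term: cross-entropy between the temperature-softened teacher and
    student distributions, scaled by T^2 so gradient magnitudes stay
    comparable as T grows. Hard term: ordinary cross-entropy on labels.
    """
    soft_teacher = tf.nn.softmax(teacher_logits / T)
    soft = tf.keras.losses.categorical_crossentropy(
        soft_teacher, student_logits / T, from_logits=True
    ) * (T ** 2)
    hard = tf.keras.losses.sparse_categorical_crossentropy(
        labels, student_logits, from_logits=True
    )
    return alpha * tf.reduce_mean(soft) + (1.0 - alpha) * tf.reduce_mean(hard)

def multi_teacher_loss(all_teacher_logits, all_head_logits, labels):
    """Hydra-style total: average the per-head losses, one head per teacher."""
    losses = [
        distillation_loss(t, s, labels)
        for t, s in zip(all_teacher_logits, all_head_logits)
    ]
    return tf.add_n(losses) / float(len(losses))
```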

Reference Implementations:

  1. Keras ResNet implementation (https://keras.io/examples/cifar10_resnet)
  2. ResNet training procedure from Ko Ye Yint Htoon (https://github.com/yeyinthtoon/Knowledge-Distillation-ResNet)
  3. Part of the knowledge distillation procedure from Devopedia (https://devopedia.org/knowledge-distillation)

Dataset: CIFAR-10
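
CIFAR-10 ships with Keras, so loading it is a one-liner; the per-pixel mean subtraction below mirrors the preprocessing in the Keras ResNet example cited above, though the notebook's exact pipeline may differ.

```python
from tensorflow import keras
import numpy as np

# Load CIFAR-10: 50k train / 10k test images of shape (32, 32, 3).
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

# Scale to [0, 1] and subtract the per-pixel training-set mean,
# as in the Keras CIFAR-10 ResNet example.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
mean = np.mean(x_train, axis=0)
x_train -= mean
x_test -= mean
```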