/knowledge-distillation

Improve performance by learning from a teacher model instead of labels from the training data


Knowledge Distillation (Paper)

Knowledge distillation is a kind of transfer learning in which a smaller model learns from a larger pretrained model

The learning process of knowledge distillation is similar to how human beings learn

Students learn by watching and copying what their teachers do

If the teacher is more capable, the students will become more capable too

Conversely, if the teacher lacks ability, the student cannot achieve good results either

We implemented the response-based offline distillation described in the paper

The training target of the student model is the predicted output of the teacher model, as sketched below
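The core idea can be expressed as a single training step where the teacher's prediction replaces the ground-truth label. This is only a minimal sketch, assuming PyTorch and a mean-squared-error loss between student and teacher outputs; the framework and loss function used in this repository may differ.

```python
import torch
import torch.nn as nn

def distill_step(teacher: nn.Module, student: nn.Module,
                 x: torch.Tensor, optimizer: torch.optim.Optimizer) -> float:
    """One response-based distillation step: the teacher's prediction is the target."""
    teacher.eval()
    with torch.no_grad():
        target = teacher(x)      # teacher output used instead of a label
    prediction = student(x)      # student output on the same input
    loss = nn.functional.mse_loss(prediction, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```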

It does not matter whether the model to be trained is a classification model, an object detection model, or an RNN model

No preprocessing logic to build the training target tensors is required at all

All you need are a teacher model and a student model with the same input and output shapes, as in the usage sketch below
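For illustration, a distillation loop might look like the following. `BigNet`, `SmallNet`, and `train_loader` are hypothetical placeholders, and `distill_step` refers to the sketch above; the point is only that the two models share input and output shapes and that labels are never used.

```python
teacher = BigNet()    # e.g. a large pretrained model (placeholder name)
student = SmallNet()  # a smaller model with the same input/output shapes (placeholder name)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

for x, _ in train_loader:  # labels from the dataset are ignored entirely
    loss = distill_step(teacher, student, x, optimizer)
```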