
knowledge-distillation

This repository contains code for experimenting with knowledge distillation, a neural network compression technique originally proposed in the article (Hinton et al., 2015). So far, data processing and model training pipelines are implemented for the Imagewoof dataset, but the framework is compatible with other datasets. Each experiment corresponds to one commit in the repository, so the code for reproducing the results is easily accessible for every experiment. The most significant results are collected in the Experiment takeaway section of this README.
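
In short, the distillation loss combines a soft-target term (teacher and student logits softened by a temperature T) with the usual cross-entropy on hard labels, weighted by a coefficient alpha. The snippet below is a minimal sketch of that loss following the standard formulation from the article; the actual implementation lives in models.py and may differ in details such as which term alpha weights.

```python
# Sketch of the distillation loss from Hinton et al. (2015).
# Illustrative only: the real implementation is in models.py.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, alpha=0.1, T=4.0):
    """Weighted sum of a soft-target term (teacher vs. student at temperature T)
    and a hard-target cross-entropy term."""
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy with the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

The alpha and T arguments correspond to the Alpha and T columns in the Experiment takeaway table below.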

Contents

  • datasets.py - code for loading the data in a PyTorch-compatible format (see the sketch after this list)
  • models.py - the models I use in my experiments and the distillation loss from the article
  • training.py - training and evaluation pipelines
  • experiments.ipynb - code, hyperparameters and plots for the current experiment
  • report.ipynb - a detailed explanation of my recent experiments (with visualisations); can be launched in Google Colab
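
As an illustration of what the data pipeline amounts to, here is a minimal sketch of loading Imagewoof in a PyTorch-compatible format with torchvision. The directory layout, image size and normalisation constants are assumptions; the actual API in datasets.py may differ.

```python
# Minimal sketch of loading Imagewoof with torchvision (paths and transforms are assumptions).
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# ImageFolder expects one subdirectory per class, which matches the Imagewoof layout.
train_set = datasets.ImageFolder("imagewoof2/train", transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)
```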

Requirements

  • Python 3.6.9
  • CUDA 10.1
  • Nvidia Driver 418.67
  • Python packages listed in requirements.txt

Install

There are two ways of running the code:

  1. In Google Colab. To start, upload the report.ipynb file to Colab.

  2. On a local machine. Make sure the requirements above are met and install the Python packages:

     git clone https://github.com/stdereka/knowledge-distillation.git
     cd knowledge-distillation
     pip install -r requirements.txt
    

Experiment takeaway

Temperature search

| Teacher Model | Student Model | Dataset | Alpha | T | Accuracy (Distilled) | Accuracy (No Teacher) | Code |
|---|---|---|---|---|---|---|---|
| resnet101_teacher | resnet18_student2 | Imagewoof | 0.1 | 1.0 | 0.9253 | 0.9262 | link |
| | | | | 2.0 | 0.9284 | | |
| | | | | 3.0 | 0.9298 | | |
| | | | | 4.0 | 0.9306 | | |
| | | | | 5.0 | 0.9303 | | |
| | | | | 6.0 | 0.9295 | | |
| | | | | 7.0 | 0.9284 | | |
| | | | | 8.0 | 0.9284 | | |
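
The sweep above varies only the temperature while keeping the other hyperparameters fixed. A sketch of such a sweep is shown below; train_one_model is a hypothetical stand-in for the training pipeline in training.py, not the repository's actual API.

```python
# Hypothetical temperature sweep: train_one_model is a placeholder callable that
# trains a distilled student and returns its validation accuracy.
def sweep_temperatures(train_one_model,
                       temperatures=(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0),
                       alpha=0.1):
    results = {T: train_one_model(alpha=alpha, temperature=T) for T in temperatures}
    best_T = max(results, key=results.get)  # T=4.0 gave the best accuracy in the run above
    return results, best_T
```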

References

  1. Distilling the Knowledge in a Neural Network
  2. An Embarrassingly Simple Approach for Knowledge Distillation
  3. Experiments on CIFAR-10 and MNIST datasets
  4. Imagewoof dataset
  5. Awesome Knowledge Distillation