Knowledge-Distillation-Paper

This repository maintains a series of papers, especially on knowledge distillation.

Early Works on Knowledge Distillation

  • Model Compression, KDD 2006 [Paper]

    • Cristian Buciluǎ, Rich Caruana, Alexandru Niculescu-Mizil.
  • Do Deep Nets Really Need to be Deep?, NIPS 2014 [Paper]

    • Lei Jimmy Ba, Rich Caruana.
  • Distilling the Knowledge in a Neural Network, NIPS-workshop 2014 [Paper]

    • Geoffrey Hinton, Oriol Vinyals, Jeff Dean.

Feature Distillation

  • FitNets: Hints for Thin Deep Nets, ICLR 2015 [Paper] [Theano]

    • Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio.
  • Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, ICLR 2017 [Paper] [PyTorch]

    • Sergey Zagoruyko, Nikos Komodakis.
  • Learning Deep Representations with Probabilistic Knowledge Transfer, ECCV 2018 [Paper] [PyTorch]

    • Nikolaos Passalis, Anastasios Tefas.
  • Knowledge Distillation via Instance Relationship Graph, CVPR 2019 [Paper] [Caffe]

    • Yufan Liu, Jiajiong Cao, Bing Li, Chunfeng Yuan, Weiming Hu, Yangxi Li and Yunqiang Duan.
  • Relational Knowledge Distillation, CVPR 2019 [Paper] [PyTorch]

    • Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho.
  • Similarity-Preserving Knowledge Distillation, ICCV 2019 [Paper]

    • Frederick Tung, Greg Mori.
  • Variational Information Distillation for Knowledge Transfer, CVPR 2019 [Paper]

    • Sungsoo Ahn, Shell Xu Hu, Andreas Damianou, Neil D. Lawrence, Zhenwen Dai.
  • Contrastive Representation Distillation, ICLR 2020 [Paper] [PyTorch]

    • Yonglong Tian, Dilip Krishnan, Phillip Isola.
  • Heterogeneous Knowledge Distillation using Information Flow Modeling, CVPR 2020 [Paper] [PyTorch]

    • Nikolaos Passalis, Maria Tzelepi, Anastasios Tefas.
  • Matching Guided Distillation, ECCV 2020 [Paper] [PyTorch]

    • Kaiyu Yue, Jiangfan Deng, Feng Zhou.
  • Cross-Layer Distillation with Semantic Calibration, AAAI 2021 [Paper] [PyTorch] [TKDE]

    • Defang Chen, Jian-Ping Mei, Yuan Zhang, Can Wang, Zhe Wang, Yan Feng, Chun Chen.
  • Distilling Holistic Knowledge with Graph Neural Networks, ICCV 2021 [Paper] [PyTorch]

    • Sheng Zhou, Yucheng Wang, Defang Chen, Jiawei Chen, Xin Wang, Can Wang, Jiajun Bu.
  • Knowledge Distillation with the Reused Teacher Classifier, CVPR 2022 [Paper] [PyTorch]

    • Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, Chun Chen.

Online Knowledge Distillation

  • Deep Mutual Learning, CVPR 2018 [Paper] [TensorFlow]

    • Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu.
  • Large scale distributed neural network training through online distillation, ICLR 2018 [Paper]

    • Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl and Geoffrey E. Hinton.
  • Collaborative Learning for Deep Neural Networks, NIPS 2018 [Paper]

    • Guocong Song, Wei Chai.
  • Knowledge Distillation by On-the-Fly Native Ensemble, NIPS 2018 [Paper] [PyTorch]

    • Xu Lan, Xiatian Zhu, Shaogang Gong.
  • Online Knowledge Distillation with Diverse Peers, AAAI 2020 [Paper] [PyTorch]

    • Defang Chen, Jian-Ping Mei, Can Wang, Yan Feng and Chun Chen.
  • Online Knowledge Distillation via Collaborative Learning, CVPR 2020 [Paper]

    • Qiushan Guo, Xinjiang Wang, Yichao Wu, Zhipeng Yu, Ding Liang, Xiaolin Hu, Ping Luo.

Multi-Teacher Knowledge Distillation

Homogeneous Label Space

  • Distilling knowledge from ensembles of neural networks for speech recognition, INTERSPEECH 2016 [Paper]

    • Austin Waters, Yevgen Chebotar.
  • Efficient Knowledge Distillation from an Ensemble of Teachers, INTERSPEECH 2017 [Paper]

    • Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Samuel Thomas, Jia Cui, Bhuvana Ramabhadran.
  • Learning from Multiple Teacher Networks, KDD 2017 [Paper]

    • Shan You, Chang Xu, Chao Xu, Dacheng Tao.
  • Multi-teacher Knowledge Distillation for Compressed Video Action Recognition on Deep Neural Networks, ICASSP 2019 [Paper]

    • Meng-Chieh Wu, Ching-Te Chiu, Kun-Hsuan Wu.
  • Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space, NIPS 2020 [Paper] [PyTorch]

    • Shangchen Du, Shan You, Xiaojie Li, Jianlong Wu, Fei Wang, Chen Qian, Changshui Zhang.
  • Adaptive Knowledge Distillation Based on Entropy, ICASSP 2020 [Paper]

    • Kisoo Kwon, Hwidong Na, Hoshik Lee, Nam Soo Kim.
  • Reinforced Multi-Teacher Selection for Knowledge Distillation, AAAI 2021 [Paper]

    • Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang.
  • Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation, BMVC 2021 [Paper] [PyTorch]

    • Sumanth Chennupati, Mohammad Mahdi Kamani, Zhongwei Cheng, Lin Chen.
  • Confidence-Aware Multi-Teacher Knowledge Distillation, ICASSP 2022 [Paper] [PyTorch]

    • Hailin Zhang, Defang Chen, Can Wang.

Diffusion Distillation

  • Progressive Distillation for Fast Sampling of Diffusion Models, ICLR 2022 [Paper] [TensorFlow]

    • Tim Salimans, Jonathan Ho
  • Accelerating Diffusion Sampling with Classifier-based Feature Distillation, arXiv 2022.11 [Paper]

    • Wujie Sun, Defang Chen, Can Wang, Deshi Ye, Yan Feng, Chun Chen

Data-Free Knowledge Distillation

  • Data-Free Knowledge Distillation for Deep Neural Networks, NIPS-workshop 2017 [Paper] [TensorFlow]

    • Raphael Gontijo Lopes, Stefano Fenu, Thad Starner
  • DAFL: Data-Free Learning of Student Networks, ICCV 2019 [Paper] [PyTorch]

    • Hanting Chen, Yunhe Wang, Chang Xu, Zhaohui Yang, Chuanjian Liu, Boxin Shi, Chunjing Xu, Chao Xu, Qi Tian
  • Zero-Shot Knowledge Distillation in Deep Networks, ICML 2019 [Paper] [TensorFlow]

    • Gaurav Kumar Nayak, Konda Reddy Mopuri, Vaisakh Shaj, R. Venkatesh Babu, Anirban Chakraborty
  • Zero-shot Knowledge Transfer via Adversarial Belief Matching, NIPS 2019 [Paper] [PyTorch]

    • Paul Micaelli, Amos Storkey
  • Knowledge Extraction with No Observable Data, NIPS 2019 [Paper] [PyTorch]

    • Jaemin Yoo, Minyong Cho, Taebum Kim, U Kang
  • Dream Distillation: A Data-Independent Model Compression Framework, ICML-workshop 2019 [Paper]

    • Kartikeya Bhardwaj, Naveen Suda, Radu Marculescu
  • DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier, AAAI 2020 [Paper] [PyTorch]

    • Sravanti Addepalli, Gaurav Kumar Nayak, Anirban Chakraborty, R. Venkatesh Babu
  • Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion, CVPR 2020 [Paper] [PyTorch]

    • Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose M. Alvarez, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz
  • The Knowledge Within: Methods for Data-Free Model Compression, CVPR 2020 [Paper]

    • Matan Haroush, Itay Hubara, Elad Hoffer, Daniel Soudry
  • Data-Free Adversarial Distillation, arXiv 2019.12 [Paper] [PyTorch]

    • Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song
    • Note: similar to "Zero-shot Knowledge Transfer via Adversarial Belief Matching" (NIPS 2019).
  • Data-Free Knowledge Distillation with Soft Targeted Transfer Set Synthesis, AAAI 2021 [Paper]

    • Zi Wang
  • Learning Student Networks in the Wild, CVPR 2021 [Paper] [PyTorch]

    • Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang
  • Contrastive Model Inversion for Data-Free Knowledge Distillation, IJCAI 2021 [Paper] [PyTorch]

    • Gongfan Fang, Jie Song, Xinchao Wang, Chengchao Shen, Xingen Wang, Mingli Song

Useful Resources

  • Statistics of acceptance rate for the main AI conferences [Link]
  • AI conference deadlines [Link]

Accepted paper list