/model-compression-and-acceleration-progress

Repository to track the progress in model compression and acceleration

Model Compression and Acceleration Progress

Repository to track the progress in model compression and acceleration

Low-rank approximation

  • T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor (CVPR 2019) paper
  • MUSCO: Multi-Stage COmpression of neural networks (ICCVW 2019) paper | code (PyTorch)
  • Efficient Neural Network Compression (CVPR 2019) paper | code (Caffe)
  • Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling (ICLR 2019) paper | code (PyTorch)
  • Extreme Network Compression via Filter Group Approximation (ECCV 2018) paper
  • Ultimate tensorization: compressing convolutional and FC layers alike (NIPS 2016 workshop) paper | code (TensorFlow) | code (MATLAB, Theano + Lasagne)
  • Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications (ICLR 2016) paper
  • Accelerating Very Deep Convolutional Networks for Classification and Detection (IEEE TPAMI 2016) paper
  • Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition (ICLR 2015) paper | code (Caffe)
  • Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (NIPS 2014) paper
  • Speeding up Convolutional Neural Networks with Low Rank Expansions (2014) paper

Pruning & Sparsification

Papers

Repos

Knowledge distillation

Papers

  • Learning Efficient Detector with Semi-supervised Adaptive Distillation (arxiv 2019) paper | code (Caffe)
  • Model compression via distillation and quantization (ICLR 2018) paper | code (Pytorch)
  • Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (ICLR 2018 workshop) paper
  • Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks ( BMVC 2018) paper
  • Net2Net: Accelerating Learning via Knowledge Transfer (ICLR 2016) paper
  • Distilling the Knowledge in a Neural Network (NIPS 2014) paper
  • FitNets: Hints for Thin Deep Nets (2014) paper | code (Theano + Pylearn2)

Repos

TensorFlow implementation of three papers https://github.com/chengshengchan/model_compression, results for CIFAR-10

Quantization

  • XNOR-Net++ (2019) paper
  • Matrix and tensor decompositions for training binary neural networks (2019) paper
  • XNOR-Net (ECCV 2016) paper | code (Pytorch)

Architecture search

  • MobileNets
  • EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019) paper | code and pretrained models (TensorFlow)
  • MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019) paper | code (TensorFlow)
  • MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure (CVPR 2018) paper | code (TensorFlow)
  • ShuffleNets
    • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (ECCV 2018) paper
    • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (CVPR 2018) paper
  • Multi-Fiber Networks for Video Recognition (ECCV 2018) paper | code (PyTorch)
  • IGCVs
    • IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks (BMVC 2018) paper | code and pretrained models (MXNet)
    • IGCV2: Interleaved Structured Sparse Convolutional Neural Networks (CVPR 2018) paper
    • Interleaved Group Convolutions for Deep Neural Networks (ICCV 2017) paper

PhD thesis and overviews

  • Algorithms for speeding up convolutional neural networks (2018) thesis
  • Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges (2018) paper
  • Efficient methods and hardware for deep learning (2017) thesis

Frameworks

  • MUSCO - framework for model compression using tensor decompositions (PyTorch, TensorFlow)
  • Distiller - package for compression using pruning and low-precision arithmetic (PyTorch)
  • MorphNet - framework for neural networks architecture learning (TensorFlow)
  • Mayo - deep learning framework with fine- and coarse-grained pruning, network slimming, and quantization methods
  • PocketFlow - framework for model pruning, sparcification, quantization (TensorFlow implementation)
  • Keras compressor - compression using low-rank approximations, SVD for matrices, Tucker for tensors.
  • Caffe compressor K-means based quantization

Comparison of different approaches

Please, see comparative_results.pdf

Similar repos