🚀🚀 Model Compression and Model Acceleration in TensorLayer
zsdonghao opened this issue · 7 comments
http://machinethink.net/blog/compressing-deep-neural-nets/
------- before 12 Aug 2018 -------
- SqueezeNet, see tutorial_squeezenet.py
- MobileNet, see tutorial_mobilenet.py
- BinaryNet, see the MNIST and CIFAR-10 examples.
- Ternary Weight Network, see the MNIST and CIFAR-10 examples.
- DoReFa-Net, see the MNIST and CIFAR-10 examples.
@XJTUWYD added two compression strategies, Ternary Weight Networks and DoReFa-Net, to TensorLayer, and ran experiments comparing the accuracy of the different compression strategies on MNIST and CIFAR-10.
The experimental results are below:
| | BinaryNet | Ternary Weight | DoReFa-Net |
|---|---|---|---|
| MNIST | 98.86% | 99.27% | 98.89% |
| CIFAR10 | 41.1% | 80.6% | 81.1% |
@XJTUWYD: BNN is an excellent piece of work on neural network compression, but it cannot reach satisfactory accuracy on relatively large datasets. To address this, Ternary Weight Networks and DoReFa-Net were proposed. I added 4 APIs to TensorLayer: TernaryDenseLayer, TernaryConv2d, DorefaDenseLayer, and DorefaConv2d. I ran 6 experiments on MNIST and CIFAR-10; the details are in the tutorials. Finally, many thanks to HaoDong, LuoMai, and Igarithm for their help.
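The ternary quantization behind these layers can be sketched in NumPy. This is a minimal sketch of the Ternary Weight Networks heuristic (threshold at 0.7 times the mean absolute weight, then scale by the mean magnitude of the surviving weights); the function name is illustrative and this is not TensorLayer's exact implementation:

```python
import numpy as np

def ternarize(w, t=0.7):
    """Ternarize a float weight tensor following the TWN heuristic:
    zero out weights below t * mean(|w|), map the rest to {-alpha, +alpha}
    where alpha is the mean magnitude of the surviving weights."""
    delta = t * np.mean(np.abs(w))            # per-tensor threshold
    mask = np.abs(w) > delta                  # weights kept as non-zero
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask          # values in {-alpha, 0, +alpha}

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02])
print(ternarize(w))
```

In training, the ternarized weights are used in the forward pass while the full-precision weights receive the gradient updates (a straight-through estimator).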
------- before 15 March 2018 -------
Hi, I am trying to make TensorLayer/TensorFlow support BinaryNet, XNOR-Net, SqueezeNet, MobileNet, ShuffleNet, DoReFa-Net, Channel Pruning, etc. Feel free to discuss and add more information here. ~
Paper List
- A Survey of Model Compression and Acceleration for Deep Neural Networks (end of 2017)
- More in arXiv-Sanity
1. Quantization
- Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations paper
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 paper itayhubara/BinaryNet.tf
- Ternary Weight Networks paper
- XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks paper AngusG/tensorflow-xnor-bnn
- DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients paper
- Towards Accurate Binary Convolutional Neural Network paper
2. Pruning
- Channel Pruning for Accelerating Very Deep Neural Networks paper
3. Structure
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size paper TensorLayer example
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications paper
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices paper code1 code2
- An Overview of Lightweight CNNs: SqueezeNet, MobileNet, ShuffleNet, Xception (in Chinese)
- A Survey of CNN Model Compression and Acceleration Algorithms (in Chinese)
About TensorLayer & TensorFlow
Others
Release 1.8.2
https://github.com/tensorlayer/tensorlayer/releases/tag/1.8.2
This is an experimental API package for building binary nets. We currently use matrix multiplication rather than add/minus and bit-count operations, so these APIs will not speed up inference. For production, you can train a model with TensorLayer and deploy it to a customized C/C++ implementation (we may provide an extra C/C++ binary-net framework that can load models from TensorLayer).
TODO
- For binary input, use XNOR and bit-count operations to replace matrix multiplication (i.e. dot product).
- For non-binary input, use add and minus operations to replace matrix multiplication.
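The first TODO item can be sketched in plain Python. Assuming both vectors are in {-1, +1} and are packed into integer bitmasks where a set bit means +1, matching bits contribute +1 to the dot product and differing bits contribute -1, so the dot product reduces to a XOR plus a bit count (function names are illustrative):

```python
def pack(vec):
    """Pack a {-1, +1} sequence into an int bitmask (bit i set <=> +1)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed n-dim {-1, +1} vectors:
    dot = n - 2 * popcount(a XOR b)."""
    return n - 2 * bin(a_bits ^ b_bits).count("1")

a = [+1, -1, +1, +1]
b = [+1, +1, -1, +1]
print(binary_dot(pack(a), pack(b), len(a)))  # → 0, same as sum(x*y for x, y in zip(a, b))
```

On real hardware the popcount is a single instruction over machine words, which is where the speed-up over floating-point multiply-accumulate comes from.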
Release SqueezeNet Example
🚀 https://github.com/tensorlayer/tensorlayer/blob/master/example/tutorial_squeezenet.py
About MobileNet and ShuffleNet
Q: I found a problem with MobileNet and ShuffleNet: they use depthwise convolution, but TensorFlow's tf.nn.depthwise_conv2d and tf.nn.separable_conv2d kernels are very slow, so in practice it seems these methods could not speed things up much? tensorflow/tensorflow#12940
A: TensorFlow 1.5 solved this problem; these operators run faster now.
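The Q&A above is about kernel implementations; the arithmetic saving of depthwise separable convolution itself is easy to verify with a quick multiply-accumulate count (a sketch; the layer sizes are illustrative, not taken from MobileNet):

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates for a standard k x k convolution
    over an h x w output feature map."""
    return h * w * k * k * c_in * c_out

def separable_macs(h, w, k, c_in, c_out):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1x1 pointwise convolution."""
    return h * w * k * k * c_in + h * w * c_in * c_out

std = conv_macs(56, 56, 3, 128, 128)
sep = separable_macs(56, 56, 3, 128, 128)
print(std / sep)  # roughly 8-9x fewer MACs for a 3x3 kernel
```

The ratio is approximately 1 / (1/c_out + 1/k²), so for 3x3 kernels and wide layers the separable form needs close to 9x fewer operations, which is why fast depthwise kernels matter so much in practice.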
Release MobileNetV1 (3 times faster than Keras)
https://github.com/tensorlayer/tensorlayer/blob/master/example/tutorial_mobilenet.py
- Note that TensorLayer's MobileNet is 3 times faster than Keras (same weights and architecture).
- TensorLayer takes 0.001~0.002 seconds per image on a Titan XP.
- Keras takes 0.004~0.005 seconds per image on a Titan XP.
```python
import numpy as np
import keras

# Build Keras MobileNetV1 with pretrained ImageNet weights for the comparison.
keras_model = keras.applications.mobilenet.MobileNet(
    input_shape=None, alpha=1.0, depth_multiplier=1, dropout=1e-3,
    include_top=True, weights='imagenet', input_tensor=None,
    pooling=None, classes=1000)

# `img` is a preprocessed HxWx3 image array.
prob = keras_model.predict(np.asarray([img]), batch_size=1)
```
TODO MobileNetV2
Which weights? I'm getting bad predictions for 'tiger.jpeg':
('Predicted :', [(u'n09472597', u'volcano', 0.20799732), (u'n03388043', u'fountain', 0.1508058), (u'n06874185', u'traffic_light', 0.12433123)])
@filipetrocadoferreira check the weights here https://github.com/tensorlayer/pretrained-models
I'm using mobilenet.npz