🚀🚀 Model Compression and Model Acceleration in TensorLayer
zsdonghao opened this issue · 7 comments
http://machinethink.net/blog/compressing-deep-neural-nets/
------- before 12 Aug 2018 -------
- SqueezeNet, see tutorial_squeezenet.py
- MobileNet, see tutorial_mobilenet.py
- BinaryNet, see the MNIST and CIFAR-10 examples.
- Ternary Weight Network, see the MNIST and CIFAR-10 examples.
- DoReFa-Net, see the MNIST and CIFAR-10 examples.
@XJTUWYD added two compression strategies, Ternary Weight Networks and DoReFa-Net, to TensorLayer, and ran experiments comparing the accuracy of the different compression strategies on MNIST and CIFAR-10.
The experimental results are below:
| | BinaryNet | Ternary Weight | DoReFa-Net |
|---|---|---|---|
| MNIST | 98.86% | 99.27% | 98.89% |
| CIFAR10 | 41.1% | 80.6% | 81.1% |
@XJTUWYD: BNN is an excellent piece of work on neural network compression, but it cannot reach satisfactory accuracy on relatively large datasets. To address this, Ternary Weight Networks and DoReFa-Net were proposed. I added 4 APIs to TensorLayer: TernaryDenseLayer, TernaryConv2d, DorefaDenseLayer, and DorefaConv2d. I ran 6 experiments on MNIST and CIFAR-10; the details are in the tutorials. Finally, many thanks to HaoDong, LuoMai, and Igarithm for their help.
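The ternary quantization behind these layers can be sketched in NumPy. This is a minimal sketch of the Ternary Weight Networks heuristic (threshold at 0.7 times the mean absolute weight, then scale by the mean magnitude of the surviving weights); the function name is illustrative and this is not TensorLayer's exact implementation:

```python
import numpy as np

def ternarize(w, t=0.7):
    """Ternarize a float weight tensor following the TWN heuristic:
    zero out weights below t * mean(|w|), map the rest to {-alpha, +alpha}
    where alpha is the mean magnitude of the surviving weights."""
    delta = t * np.mean(np.abs(w))            # per-tensor threshold
    mask = np.abs(w) > delta                  # weights kept as non-zero
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask          # values in {-alpha, 0, +alpha}

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02])
print(ternarize(w))
```

In training, the ternarized weights are used in the forward pass while the full-precision weights receive the gradient updates (a straight-through estimator).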
------- before 15 March 2018 -------
Hi, I am trying to make TensorLayer/TensorFlow support BinaryNet, XNOR-Net, SqueezeNet, MobileNet, ShuffleNet, DoReFa-Net, Channel Pruning, etc. Feel free to discuss and add more information here. ~
Paper List
- A Survey of Model Compression and Acceleration for Deep Neural Networks (end of 2017)
- More in arXiv-Sanity
1. Quantization
- Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations paper
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 paper itayhubara/BinaryNet.tf
- Ternary Weight Networks paper
- XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks paper AngusG/tensorflow-xnor-bnn
- DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients paper
- Towards Accurate Binary Convolutional Neural Network paper
2. Pruning
- Channel Pruning for Accelerating Very Deep Neural Networks paper
3. Structure
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size paper TensorLayer example
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications paper
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices paper code1 code2
- An Overview of Lightweight CNNs: SqueezeNet, MobileNet, ShuffleNet, Xception (in Chinese)
- A Survey of CNN Model Compression and Acceleration Algorithms (in Chinese)
About TensorLayer & TensorFlow
Others
Release 1.8.2
https://github.com/tensorlayer/tensorlayer/releases/tag/1.8.2
This is an experimental API package for building binary nets. We currently use matrix multiplication rather than add/minus and bit-count operations, so these APIs will not speed up inference. For production, you can train a model with TensorLayer and deploy it to a customized C/C++ implementation (we may provide an extra C/C++ binary-net framework that can load models from TensorLayer).
TODO
- For binary input, use XNOR and bit-count operations to replace matrix multiplication (i.e. dot product).
- For non-binary input, use add and minus operations to replace matrix multiplication.
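The first TODO item can be sketched in plain Python. Assuming both vectors are in {-1, +1} and are packed into integer bitmasks where a set bit means +1, matching bits contribute +1 to the dot product and differing bits contribute -1, so the dot product reduces to a XOR plus a bit count (function names are illustrative):

```python
def pack(vec):
    """Pack a {-1, +1} sequence into an int bitmask (bit i set <=> +1)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed n-dim {-1, +1} vectors:
    dot = n - 2 * popcount(a XOR b)."""
    return n - 2 * bin(a_bits ^ b_bits).count("1")

a = [+1, -1, +1, +1]
b = [+1, +1, -1, +1]
print(binary_dot(pack(a), pack(b), len(a)))  # → 0, same as sum(x*y for x, y in zip(a, b))
```

On real hardware the popcount is a single instruction over machine words, which is where the speed-up over floating-point multiply-accumulate comes from.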
Release SqueezeNet Example
🚀 https://github.com/tensorlayer/tensorlayer/blob/master/example/tutorial_squeezenet.py
About MobileNet and ShuffleNet
Q: I found a problem with MobileNet and ShuffleNet: they use depthwise convolution, but TensorFlow's tf.nn.depthwise_conv2d and tf.nn.separable_conv2d kernels are very slow, so in practice it seems these methods could not speed things up much? tensorflow/tensorflow#12940
A: TensorFlow 1.5 solved this problem; these operators run faster now.
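The Q&A above is about kernel implementations; the arithmetic saving of depthwise separable convolution itself is easy to verify with a quick multiply-accumulate count (a sketch; the layer sizes are illustrative, not taken from MobileNet):

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates for a standard k x k convolution
    over an h x w output feature map."""
    return h * w * k * k * c_in * c_out

def separable_macs(h, w, k, c_in, c_out):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1x1 pointwise convolution."""
    return h * w * k * k * c_in + h * w * c_in * c_out

std = conv_macs(56, 56, 3, 128, 128)
sep = separable_macs(56, 56, 3, 128, 128)
print(std / sep)  # roughly 8-9x fewer MACs for a 3x3 kernel
```

The ratio is approximately 1 / (1/c_out + 1/k²), so for 3x3 kernels and wide layers the separable form needs close to 9x fewer operations, which is why fast depthwise kernels matter so much in practice.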
Release MobileNetV1 (3 times faster than Keras)
https://github.com/tensorlayer/tensorlayer/blob/master/example/tutorial_mobilenet.py
- Note that TensorLayer's MobileNet is 3 times faster than Keras (same weights and architecture).
- TensorLayer takes 0.001~0.002 seconds per image on a Titan XP.
- Keras takes 0.004~0.005 seconds per image on a Titan XP.
```python
import numpy as np
import keras

# Build Keras MobileNetV1 with pretrained ImageNet weights for the comparison.
keras_model = keras.applications.mobilenet.MobileNet(
    input_shape=None, alpha=1.0, depth_multiplier=1, dropout=1e-3,
    include_top=True, weights='imagenet', input_tensor=None,
    pooling=None, classes=1000)

# `img` is a preprocessed HxWx3 image array.
prob = keras_model.predict(np.asarray([img]), batch_size=1)
```
TODO MobileNetV2
Which weights? I'm getting bad predictions for 'tiger.jpeg':
('Predicted :', [(u'n09472597', u'volcano', 0.20799732), (u'n03388043', u'fountain', 0.1508058), (u'n06874185', u'traffic_light', 0.12433123)])
@filipetrocadoferreira check the weights here https://github.com/tensorlayer/pretrained-models
I'm using mobilenet.npz