Training Low-bit DNNs with Stochastic Quantization

Stochastic-Quantization

Introduction

This repository contains the code for training and testing Stochastic Quantization (SQ), as described in the paper "Learning Accurate Low-bit Deep Neural Networks with Stochastic Quantization" (BMVC 2017, Oral).

Our code is implemented on top of the Caffe framework. It can be used to train BWN (Binary Weight Networks), TWN (Ternary Weight Networks), SQ-BWN and SQ-TWN.
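To make the two base quantizers concrete, here is a minimal pure-Python sketch of BWN-style and TWN-style weight quantization as they are commonly formulated in the literature (scale alpha = mean |w| for binary weights; threshold 0.7 * mean |w| for ternary weights). It is an illustration only, not the Caffe/CUDA code in this repository; the function names binarize and ternarize and the per-list granularity are assumptions made for the example.

```python
def binarize(weights):
    """BWN-style: every weight becomes +alpha or -alpha, alpha = mean(|w|)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    # Zeros are mapped to +alpha so that only two values are ever used.
    return [alpha if w >= 0 else -alpha for w in weights]


def ternarize(weights, delta_scale=0.7):
    """TWN-style: every weight becomes -alpha, 0 or +alpha."""
    # Weights whose magnitude is at or below delta are set to zero.
    delta = delta_scale * sum(abs(w) for w in weights) / len(weights)
    kept = [abs(w) for w in weights if abs(w) > delta]
    # alpha is the mean magnitude of the weights above the threshold.
    alpha = sum(kept) / len(kept) if kept else 0.0
    return [0.0 if abs(w) <= delta else (alpha if w > 0 else -alpha)
            for w in weights]


w = [0.9, -0.1, 0.4, -0.6]
print(binarize(w))   # two distinct values: +alpha and -alpha
print(ternarize(w))  # the small weight -0.1 is zeroed out
```

In practice the scale is usually computed per filter or per layer rather than over an arbitrary list; the sketch keeps a flat list only for readability.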

Usage

Build Caffe

Please follow the standard Caffe installation procedure, then build from the caffe/ directory:

cd caffe/
make
cd ..

Training and Testing

CIFAR

For CIFAR-10 and CIFAR-100, we provide two network architectures, VGG-9 and ResNet-56 (see the paper for details). For example, use the following commands to train ResNet-56:

  • FWN
./CIFAR/ResNet-56/FWN/train.sh
  • BWN
./CIFAR/ResNet-56/BWN/train.sh
  • TWN
./CIFAR/ResNet-56/TWN/train.sh
  • SQ-BWN
./CIFAR/ResNet-56/SQ-BWN/train.sh
  • SQ-TWN
./CIFAR/ResNet-56/SQ-TWN/train.sh

ImageNet

For ImageNet, we provide AlexNet-BN and ResNet-18 network architectures. For example, use the following commands to train ResNet-18:

  • FWN
./ImageNet/ResNet-18/FWN/train.sh
  • BWN
./ImageNet/ResNet-18/BWN/train.sh
  • TWN
./ImageNet/ResNet-18/TWN/train.sh
  • SQ-BWN
./ImageNet/ResNet-18/SQ-BWN/train.sh
  • SQ-TWN
./ImageNet/ResNet-18/SQ-TWN/train.sh
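The training scripts listed above follow a regular dataset/architecture/method layout. The helper below is a small sketch that builds those paths so that several runs can be scripted; train_script and METHODS are hypothetical names introduced here. Whether the other architectures (for example VGG-9 or AlexNet-BN) use exactly the same directory pattern is an assumption that should be checked against the repository tree.

```python
METHODS = ("FWN", "BWN", "TWN", "SQ-BWN", "SQ-TWN")


def train_script(dataset, arch, method):
    """Return the path of a training script, e.g. ./CIFAR/ResNet-56/BWN/train.sh."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return f"./{dataset}/{arch}/{method}/train.sh"


# List the ResNet-18 ImageNet scripts in the order shown above.
for method in METHODS:
    print(train_script("ImageNet", "ResNet-18", method))
```

Each returned path can then be passed to a shell, for instance with subprocess.run(["bash", path], check=True) from the repository root.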

Implementation

Layers

We add BinaryConvolution, BinaryInnerProduct, TernaryConvolution and TernaryInnerProduct layers for training binary and ternary networks. Helper functions shared by the low-bit layers are collected in lowbit-functions.

Params

We add two parameters, sq and ratio, to convolution_param and inner_product_param. sq specifies whether to use stochastic quantization (default: false). ratio is the SQ ratio, i.e. the portion of the weights that is quantized (default: 100).
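The effect of sq and ratio can be illustrated with a short pure-Python sketch of the selection step, following the idea in the paper as we understand it: a portion of the filters (rows) is quantized while the rest stay at full precision, and rows with a smaller relative quantization error are more likely to be picked. This is not the repository's GPU implementation. The function names are hypothetical, and treating ratio as a percentage (suggested by the default of 100) is an assumption.

```python
import random


def binarize(row):
    """Simple binary quantizer used for the demo: +/- mean(|w|)."""
    alpha = sum(abs(w) for w in row) / len(row)
    return [alpha if w >= 0 else -alpha for w in row]


def quantization_error(row, quantized_row):
    """Relative L1 error between a row and its quantized version."""
    num = sum(abs(w - q) for w, q in zip(row, quantized_row))
    den = sum(abs(w) for w in row)
    return num / den if den else 0.0


def select_rows_to_quantize(errors, ratio, rng, eps=1e-7):
    """Pick ratio% of the rows without replacement, favouring small errors."""
    n_quantized = round(len(errors) * ratio / 100.0)
    candidates = list(range(len(errors)))
    # A smaller quantization error gives a larger sampling weight.
    weights = [1.0 / (e + eps) for e in errors]
    chosen = []
    for _ in range(n_quantized):
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return sorted(chosen)


def stochastic_quantize(matrix, quantizer, ratio, rng):
    """Quantize the selected rows; keep the others at full precision."""
    quantized = [quantizer(row) for row in matrix]
    errors = [quantization_error(r, q) for r, q in zip(matrix, quantized)]
    chosen = set(select_rows_to_quantize(errors, ratio, rng))
    return [quantized[i] if i in chosen else list(matrix[i])
            for i in range(len(matrix))]


rng = random.Random(0)  # fixed seed so the demo is reproducible
W = [[0.9, -0.1], [0.5, -0.5], [0.2, 0.8], [-0.3, 0.3]]
mixed = stochastic_quantize(W, binarize, ratio=50, rng=rng)
print(mixed)  # two of the four rows are replaced by their binary version
```

With ratio=100 every row is quantized, which corresponds to plain BWN or TWN behaviour at the end of training; lower values leave part of the layer at full precision.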

Note

The code currently runs correctly only on GPU. A CPU version has not been implemented yet.

Have fun deploying your own low-bit DNNs!