BinaryNet

Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1

Deep Networks on classification tasks using Torch

This is a complete training example for BinaryNets using the Binary-Backpropagation algorithm, as explained in "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1" by Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio, on the following datasets: Cifar10/100, SVHN, MNIST.

Data

We use the dp library to extract all the data; see the Dependencies section for installation.

Dependencies

To install all dependencies (assuming torch is installed) use:

luarocks install https://raw.githubusercontent.com/eladhoffer/DataProvider.torch/master/dataprovider-scm-1.rockspec
luarocks install cudnn
luarocks install dp
luarocks install unsup
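
Because the install commands assume Torch is already present, it may help to confirm that `th` and `luarocks` are on the PATH first. The following is a minimal sketch; `have_cmd` is a hypothetical helper name and is not part of this repository.

```shell
#!/bin/sh
# Sketch: confirm the tools this README relies on are on PATH before installing.
# have_cmd is a hypothetical helper, not something the repository provides.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report each tool's status without aborting, so every missing tool is listed.
for tool in th luarocks; do
    if have_cmd "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool"
    fi
done
```

If either tool is reported missing, the `luarocks install` lines above will not work until Torch is installed.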

Training

Create pre-processing folder:

cd BinaryNet
mkdir PreProcData

Start training using:

th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model

or,

th Main_BinaryNet_MNIST.lua -network BinaryNet_MNIST_Model
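
The two commands above differ only in the script and model names, so a small wrapper can select between them. This is a sketch under assumptions: `train_cmd` and the `DATASET` variable are hypothetical names, only the two scripts shown in this README are covered, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: pick the training command for a dataset shown in this README.
# train_cmd and DATASET are hypothetical names used for illustration only.
train_cmd() {
    case "$1" in
        Cifar10) echo "th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model" ;;
        MNIST)   echo "th Main_BinaryNet_MNIST.lua -network BinaryNet_MNIST_Model" ;;
        *)       echo "unsupported dataset: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the setup and training commands instead of running them.
# mkdir -p is used so the step is safe to repeat if the folder already exists.
echo "mkdir -p PreProcData"
train_cmd "${DATASET:-Cifar10}"
```

Setting `DATASET=MNIST` before running the script prints the MNIST command; any other name is rejected with a non-zero status.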

Run with Docker

The Docker image is built from nvidia/cuda:8.0-cudnn5-devel with Torch at commit 0219027e6c4644a0ba5c5bf137c989a0a8c9e01b.

  • To build the image, run:
    docker build -t binarynet:torch-gpu-cuda-8.0 -f Dockerfile/binarynet-torch-gpu-cuda-8.0 .
    Alternatively, pull the image:
    docker pull hychiang/binarynet:torch-gpu-cuda-8.0

  • To launch the image with GPU access, run: docker run -it --gpus all binarynet:torch-gpu-cuda-8.0

  • To train a BNN on Cifar10, run: th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model

Additional flags

| Flag | Default Value | Description |
|------|---------------|-------------|
| modelsFolder | ./Models/ | Models folder |
| network | Model.lua | Model file (must return a valid network) |
| LR | 0.1 | Learning rate |
| LRDecay | 0 | Learning rate decay (in # samples) |
| weightDecay | 1e-4 | L2 penalty on the weights |
| momentum | 0.9 | Momentum |
| batchSize | 128 | Batch size |
| stcNeurons | true | Whether to use stochastic binarization for the neurons |
| stcWeights | false | Whether to use stochastic binarization for the weights |
| optimization | adam | Optimization method |
| SBN | true | Whether to use shift-based batch normalization |
| runningVal | true | Whether to use running mean and std |
| epoch | -1 | Number of epochs to train (-1 for unbounded) |
| threads | 8 | Number of threads |
| type | cuda | float or cuda |
| devid | 1 | Device ID (if using CUDA) |
| load | none | Load existing net weights |
| save | time-identifier | Save directory |
| dataset | Cifar10 | Dataset: Cifar10, Cifar100, STL10, SVHN, MNIST |
| dp_prepro | false | Preprocessing using the dp library |
| whiten | false | Whiten data |
| augment | false | Augment training data |
| preProcDir | ./PreProcData/ | Data for pre-processing (means, Pinv, P) |
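
As an illustration of overriding the defaults above, the sketch below assembles a training command with a smaller batch size, a lower learning rate, and a bounded number of epochs. It assumes every flag is passed in the same `-name value` form as `-network`; only numeric flags are set, since this README does not show the syntax for boolean flags. The variable names (`BATCH_SIZE`, `LEARNING_RATE`, `EPOCHS`, `CMD`) and the chosen values are hypothetical.

```shell
#!/bin/sh
# Sketch: build a training command that overrides a few defaults from the table.
# Assumption: each flag takes the "-name value" form used by -network.
# The variable names and values below are illustrative, not recommendations.
BATCH_SIZE="${BATCH_SIZE:-64}"
LEARNING_RATE="${LEARNING_RATE:-0.01}"
EPOCHS="${EPOCHS:-50}"

CMD="th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model"
CMD="$CMD -batchSize $BATCH_SIZE -LR $LEARNING_RATE -epoch $EPOCHS"

# Print rather than execute, so the command can be reviewed before running it.
echo "$CMD"
```

Any flag left out of the command keeps the default listed in the table.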