Caffe implementation of Optimal-Ternary-Weights-Approximation in "Two-Step Quantization for Low-bit Neural Networks" (CVPR2018).
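The weight quantization approximates each full-precision weight tensor `w` by `alpha * t` with `t` ternary. Below is a minimal NumPy sketch of one way to solve `min ||w - alpha * t||^2` exactly (the optimal support is a prefix of `|w|` sorted in descending order); the function name `otwa` and the whole-tensor granularity are illustrative assumptions, not this repository's API.

```python
import numpy as np

def otwa(w):
    """Optimal ternary approximation of a weight tensor.

    Solves  min_{alpha > 0, t in {-1, 0, +1}^n}  ||w - alpha * t||^2  exactly:
    for a fixed support S the best alpha is mean(|w_i| for i in S), the residual
    becomes ||w||^2 - (sum_{i in S} |w_i|)^2 / |S|, so the optimal support is a
    prefix of |w| sorted in descending order.
    """
    flat = w.ravel()
    a = np.abs(flat)
    order = np.argsort(-a)                 # indices of |w| in descending order
    prefix = np.cumsum(a[order])           # prefix sums of the sorted magnitudes
    k = np.arange(1, a.size + 1)
    scores = prefix ** 2 / k               # objective to maximize over the prefix length
    best = int(np.argmax(scores)) + 1      # optimal number of non-zero ternary entries
    alpha = prefix[best - 1] / best        # optimal scaling factor

    t = np.zeros_like(flat, dtype=np.int8)
    kept = order[:best]
    t[kept] = np.sign(flat[kept]).astype(np.int8)
    return alpha, t.reshape(w.shape)

# Example: ternarize a 64x3x3 filter bank and build the weights used in the forward pass.
w = np.random.randn(64, 3, 3).astype(np.float32)
alpha, t = otwa(w)
w_ternary = alpha * t
```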
We use a temporary memory block to store the ternary weights, while the full-precision weights are kept in this->blobs_[0]. During back-propagation, the gradients are accumulated into the full-precision weights, and the ternary weights are used to compute the bottom gradients.
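For illustration, here is a schematic, framework-agnostic version of that forward/backward step; a dense layer stands in for the convolution, and the names `ternary_layer_step`, `x`, `w_fp`, `dy`, and `lr` are hypothetical rather than part of this Caffe layer.

```python
def ternary_layer_step(x, w_fp, dy, lr=0.01):
    """One schematic forward/backward step; a dense layer stands in for the convolution.

    w_fp plays the role of this->blobs_[0] (full precision); the ternary copy exists
    only as a temporary value, mirroring the memory layout described above.
    """
    alpha, t = otwa(w_fp)          # otwa() as sketched earlier; re-quantized every iteration
    w_q = alpha * t                # temporary ternary weights

    y = x @ w_q                    # forward pass uses the ternary weights

    dx = dy @ w_q.T                # bottom gradients computed with the ternary weights
    dw = x.T @ dy                  # weight gradients (straight-through estimator)
    w_fp = w_fp - lr * dw          # accumulated into the full-precision weights
    return y, dx, w_fp
```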
To use the layer, change `type: "Convolution"` to `type: "TernaryConvolution"` in your network prototxt, e.g.:
```
layer {
  bottom: "pool1"
  top: "res2a_branch1"
  name: "res2a_branch1"
  type: "TernaryConvolution"
  convolution_param {
    num_output: 64
    kernel_size: 1
    pad: 0
    stride: 1
    weight_filler {
      type: "msra"
    }
    bias_term: false
  }
}
```
So far, only the GPU implementation is available.
Please refer to wps712 for further details.