Caffe implementation of Optimal-Ternary-Weights-Approximation in "Two-Step Quantization for Low-bit Neural Networks" (CVPR2018).
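The weight quantization approximates each full-precision weight tensor `w` by `alpha * t` with `t` ternary. Below is a minimal NumPy sketch of one way to solve `min ||w - alpha * t||^2` exactly (the optimal support is a prefix of `|w|` sorted in descending order); the function name `otwa` and the whole-tensor granularity are illustrative assumptions, not this repository's API.

```python
import numpy as np

def otwa(w):
    """Optimal ternary approximation of a weight tensor.

    Solves  min_{alpha > 0, t in {-1, 0, +1}^n}  ||w - alpha * t||^2  exactly:
    for a fixed support S the best alpha is mean(|w_i| for i in S), the residual
    becomes ||w||^2 - (sum_{i in S} |w_i|)^2 / |S|, so the optimal support is a
    prefix of |w| sorted in descending order.
    """
    flat = w.ravel()
    a = np.abs(flat)
    order = np.argsort(-a)                 # indices of |w| in descending order
    prefix = np.cumsum(a[order])           # prefix sums of the sorted magnitudes
    k = np.arange(1, a.size + 1)
    scores = prefix ** 2 / k               # objective to maximize over the prefix length
    best = int(np.argmax(scores)) + 1      # optimal number of non-zero ternary entries
    alpha = prefix[best - 1] / best        # optimal scaling factor

    t = np.zeros_like(flat, dtype=np.int8)
    kept = order[:best]
    t[kept] = np.sign(flat[kept]).astype(np.int8)
    return alpha, t.reshape(w.shape)

# Example: ternarize a 64x3x3 filter bank and build the weights used in the forward pass.
w = np.random.randn(64, 3, 3).astype(np.float32)
alpha, t = otwa(w)
w_ternary = alpha * t
```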
We use a temporary memory block to store the ternary weights, while the full-precision weights are kept in this->blobs_[0]. During back-propagation, the gradients are accumulated into the full-precision weights, and the ternary weights are used to compute the bottom gradients.
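For illustration, here is a schematic, framework-agnostic version of that forward/backward step; a dense layer stands in for the convolution, and the names `ternary_layer_step`, `x`, `w_fp`, `dy`, and `lr` are hypothetical rather than part of this Caffe layer.

```python
def ternary_layer_step(x, w_fp, dy, lr=0.01):
    """One schematic forward/backward step; a dense layer stands in for the convolution.

    w_fp plays the role of this->blobs_[0] (full precision); the ternary copy exists
    only as a temporary value, mirroring the memory layout described above.
    """
    alpha, t = otwa(w_fp)          # otwa() as sketched earlier; re-quantized every iteration
    w_q = alpha * t                # temporary ternary weights

    y = x @ w_q                    # forward pass uses the ternary weights

    dx = dy @ w_q.T                # bottom gradients computed with the ternary weights
    dw = x.T @ dy                  # weight gradients (straight-through estimator)
    w_fp = w_fp - lr * dw          # accumulated into the full-precision weights
    return y, dx, w_fp
```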
To use the layer, change `type: "Convolution"` to `type: "TernaryConvolution"` in your network prototxt, e.g.:
```
layer {
  bottom: "pool1"
  top: "res2a_branch1"
  name: "res2a_branch1"
  type: "TernaryConvolution"
  convolution_param {
    num_output: 64
    kernel_size: 1
    pad: 0
    stride: 1
    weight_filler {
      type: "msra"
    }
    bias_term: false
  }
}
```
So far, only the GPU implementation is available.
Please refer to wps712 for further details.