How to quantize the inputs and weights?
First, thanks for your paper.
I want to know how to quantize the weights and inputs, and how to handle the result.
Can you point me to some code for the quantization?
Hi @victorygogogo
The code for the quantization phase (learning codebooks and assignments) cannot be released at the moment. You may present your detailed questions here for further discussion.
I trained a model in float precision.
I want to use 8-bit quantization to improve my inference time,
so the weights, biases, and inputs should all be quantized.
There is a problem: what about the scale for the result?
So I read your paper; how do you solve this problem?
8-bit quantization with weights, biases, and inputs quantized? Do you mean uniform quantization, where possible quantization values are [-k, -k+1, ..., -1, 0, 1, ..., k - 1, k]? If so, you may refer to DoReFa-Net which suits your problem better.
Our approach uses non-uniform quantization, and only weights are quantized (biases and inputs are not quantized).
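For reference, here is a minimal sketch of uniform symmetric 8-bit quantization (illustrative only, not code from this repository; the helper name and layout are hypothetical). It also hints at the "scale for the result" question: when weights and inputs are both quantized this way, the integer accumulator carries the combined scale `scale_w * scale_x`, which you divide out (or fold into the next layer's scale) to recover the float result.

```cpp
// Illustrative sketch only -- not part of this repository.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Map floats onto int8 values in [-127, 127] with a single scale factor.
// Dequantize via x ~= q * scale. With quantized weights and inputs, the
// int32 accumulator of a dot product has effective scale scale_w * scale_x.
std::vector<int8_t> QuantizeUniform(const std::vector<float>& x, float* scaleOut) {
  float absMax = 0.0f;
  for (float v : x) absMax = std::max(absMax, std::fabs(v));
  const float scale = (absMax > 0.0f) ? absMax / 127.0f : 1.0f;
  std::vector<int8_t> q(x.size());
  for (size_t i = 0; i < x.size(); ++i) {
    const float r = std::round(x[i] / scale);
    q[i] = static_cast<int8_t>(std::max(-127.0f, std::min(127.0f, r)));
  }
  *scaleOut = scale;
  return q;
}
```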
Only the weights are quantized?
If only the weights are quantized, how do you speed up the inference time?
For example, how do you speed up the GEMM-based convolution?
Sorry for the late reply.
In our work, only weights are quantized. During the test phase, the matrix multiplication is converted into a series of table look-up operations (please refer to our paper for details). This results in a reduction in FLOPs.
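To make the look-up idea concrete, here is a hedged sketch for a fully-connected layer (all names and the data layout are hypothetical; see the repository functions listed below for the real implementation). Inner products between each input sub-vector and all K codewords of its sub-codebook are pre-computed once, and every output is then assembled from M table entries instead of a full-length inner product:

```cpp
// Hedged sketch of look-up-based forward computation; names are illustrative.
#include <vector>

std::vector<float> FCApproxLookup(
    const std::vector<float>& x,  // input activations, size M * dSub
    const std::vector<std::vector<std::vector<float>>>& subCodebooks,  // [M][K][dSub]
    const std::vector<std::vector<int>>& assignments,  // [M][numOut], values in [0, K)
    int M, int K, int dSub, int numOut) {
  // Step 1: pre-compute inner products of each input sub-vector with every
  // codeword of the corresponding sub-codebook: O(M * K * dSub) multiply-adds.
  std::vector<std::vector<float>> lut(M, std::vector<float>(K, 0.0f));
  for (int m = 0; m < M; ++m)
    for (int k = 0; k < K; ++k)
      for (int d = 0; d < dSub; ++d)
        lut[m][k] += x[m * dSub + d] * subCodebooks[m][k][d];

  // Step 2: each output neuron sums the M table entries selected by its
  // sub-codeword assignments -- no full inner product is computed here.
  std::vector<float> y(numOut, 0.0f);
  for (int j = 0; j < numOut; ++j)
    for (int m = 0; m < M; ++m)
      y[j] += lut[m][assignments[m][j]];
  return y;
}
```

Since the pre-computation cost is amortized over all `numOut` outputs, the per-output cost drops from a length-(M * dSub) inner product to M look-ups and adds.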
Can you tell me where to find the table-look-up-based matrix multiplication in your code?
For convolutional layers, please refer to:
`void CaffeEva::CalcFeatMap_ConvAprx(...)`
For fully-connected layers, please refer to:
`void CaffeEva::CalcFeatMap_FCntAprx(...)`
@jiaxiang-wu
Thank you!
How is the weight quantization done?
And how is the look-up table built?
- Do you mean how to obtain the D (sub-codebooks) & B (sub-codeword assignments) matrices for each layer? The training code for these two is not included in this repository; you need to implement it yourself, under the guidance of our CVPR paper (a rough sketch follows after this list).
- Forward computation with look-up tables during the test phase is included in these two functions:
  - `CalcFeatMap_ConvAprx()`
  - `CalcFeatMap_FCntAprx()`
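If you implement the quantization phase yourself, the core building block is k-means clustering run independently over the weight sub-vectors of each subspace: the resulting centroids form a sub-codebook (one block of D), and each sub-vector's nearest-centroid index is its entry in B. Below is a minimal, hypothetical sketch of one such k-means run (plain reconstruction-error k-means only; the paper goes further and minimizes each layer's response error, so treat this as a starting point):

```cpp
// Illustrative k-means for one subspace; not code from this repository.
#include <vector>

// Cluster 'vecs' (each of dimension 'dim') into K codewords; writes each
// vector's codeword index into 'assign' and returns the codewords.
std::vector<std::vector<float>> KMeansSubspace(
    const std::vector<std::vector<float>>& vecs,
    int K, int dim, int iters, std::vector<int>* assign) {
  std::vector<std::vector<float>> codewords(K);
  for (int k = 0; k < K; ++k)  // naive init: seed with existing sub-vectors
    codewords[k] = vecs[static_cast<size_t>(k) % vecs.size()];
  assign->assign(vecs.size(), 0);
  for (int it = 0; it < iters; ++it) {
    // Assignment step: nearest codeword in squared L2 distance.
    for (size_t i = 0; i < vecs.size(); ++i) {
      float best = 1e30f;
      for (int k = 0; k < K; ++k) {
        float d = 0.0f;
        for (int j = 0; j < dim; ++j) {
          const float diff = vecs[i][j] - codewords[k][j];
          d += diff * diff;
        }
        if (d < best) { best = d; (*assign)[i] = k; }
      }
    }
    // Update step: move each codeword to the mean of its assigned vectors.
    std::vector<std::vector<float>> sum(K, std::vector<float>(dim, 0.0f));
    std::vector<int> cnt(K, 0);
    for (size_t i = 0; i < vecs.size(); ++i) {
      const int k = (*assign)[i];
      ++cnt[k];
      for (int j = 0; j < dim; ++j) sum[k][j] += vecs[i][j];
    }
    for (int k = 0; k < K; ++k)
      if (cnt[k] > 0)
        for (int j = 0; j < dim; ++j) codewords[k][j] = sum[k][j] / cnt[k];
  }
  return codewords;
}
```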
OK!
Do you have any plans to open-source all the code?
We do not have such a plan at the moment. Sorry.
Is the whole code released?
@wuzhiyang2016 No, only the inference code is released.