How to quantize the inputs and weights?
First, thanks for your paper.
I want to know how to quantize the weights and inputs, and how to handle the result.
Can you point me to some code for the quantization?
Hi @victorygogogo
The code for the quantization phase (learning codebooks and assignments) cannot be released at the moment. You may present your detailed questions here for further discussion.
I trained a model in float precision.
I want to use 8-bit quantization to improve my inference time,
so the weights, biases, and inputs should all be quantized.
There is a problem: what about the scale for the result?
So I read your paper; how do you solve this problem?
8-bit quantization with weights, biases, and inputs quantized? Do you mean uniform quantization, where possible quantization values are [-k, -k+1, ..., -1, 0, 1, ..., k - 1, k]? If so, you may refer to DoReFa-Net which suits your problem better.
Our approach uses non-uniform quantization, and only weights are quantized (biases and inputs are not quantized).
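For reference, here is a minimal sketch of uniform symmetric 8-bit quantization (illustrative only, not code from this repository; the helper name and layout are hypothetical). It also hints at the "scale for the result" question: when weights and inputs are both quantized this way, the integer accumulator carries the combined scale `scale_w * scale_x`, which you divide out (or fold into the next layer's scale) to recover the float result.

```cpp
// Illustrative sketch only -- not part of this repository.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Map floats onto int8 values in [-127, 127] with a single scale factor.
// Dequantize via x ~= q * scale. With quantized weights and inputs, the
// int32 accumulator of a dot product has effective scale scale_w * scale_x.
std::vector<int8_t> QuantizeUniform(const std::vector<float>& x, float* scaleOut) {
  float absMax = 0.0f;
  for (float v : x) absMax = std::max(absMax, std::fabs(v));
  const float scale = (absMax > 0.0f) ? absMax / 127.0f : 1.0f;
  std::vector<int8_t> q(x.size());
  for (size_t i = 0; i < x.size(); ++i) {
    const float r = std::round(x[i] / scale);
    q[i] = static_cast<int8_t>(std::max(-127.0f, std::min(127.0f, r)));
  }
  *scaleOut = scale;
  return q;
}
```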
Only the weights are quantized?
If only the weights are quantized, how do you speed up the inference time?
For example, how do you speed up the GEMM-based convolution?
Sorry for the late reply.
In our work, only weights are quantized. During the test phase, the matrix multiplication is converted into a series of table look-up operations (please refer to our paper for details). This results in a reduction in FLOPs.
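To make the look-up idea concrete, here is a hedged sketch for a fully-connected layer (all names and the data layout are hypothetical; see the repository functions listed below for the real implementation). Inner products between each input sub-vector and all K codewords of its sub-codebook are pre-computed once, and every output is then assembled from M table entries instead of a full-length inner product:

```cpp
// Hedged sketch of look-up-based forward computation; names are illustrative.
#include <vector>

std::vector<float> FCApproxLookup(
    const std::vector<float>& x,  // input activations, size M * dSub
    const std::vector<std::vector<std::vector<float>>>& subCodebooks,  // [M][K][dSub]
    const std::vector<std::vector<int>>& assignments,  // [M][numOut], values in [0, K)
    int M, int K, int dSub, int numOut) {
  // Step 1: pre-compute inner products of each input sub-vector with every
  // codeword of the corresponding sub-codebook: O(M * K * dSub) multiply-adds.
  std::vector<std::vector<float>> lut(M, std::vector<float>(K, 0.0f));
  for (int m = 0; m < M; ++m)
    for (int k = 0; k < K; ++k)
      for (int d = 0; d < dSub; ++d)
        lut[m][k] += x[m * dSub + d] * subCodebooks[m][k][d];

  // Step 2: each output neuron sums the M table entries selected by its
  // sub-codeword assignments -- no full inner product is computed here.
  std::vector<float> y(numOut, 0.0f);
  for (int j = 0; j < numOut; ++j)
    for (int m = 0; m < M; ++m)
      y[j] += lut[m][assignments[m][j]];
  return y;
}
```

Since the pre-computation cost is amortized over all `numOut` outputs, the per-output cost drops from a length-(M * dSub) inner product to M look-ups and adds.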
Can you tell me where to find the table-look-up-based matrix multiplication in your code?
For convolutional layers, please refer to:
`void CaffeEva::CalcFeatMap_ConvAprx(...)`
For fully-connected layers, please refer to:
`void CaffeEva::CalcFeatMap_FCntAprx(...)`
@jiaxiang-wu
Thank you!
How is the weight quantization done?
And how is the look-up table built?
- Do you mean how to obtain the D (sub-codebooks) & B (sub-codeword assignments) matrices for each layer? The training code for these two is not included in this repository; you need to implement it yourself, under the guidance of our CVPR paper (a rough sketch follows after this list).
- Forward computation with look-up tables during the test phase is included in these two functions:
  - `CalcFeatMap_ConvAprx()`
  - `CalcFeatMap_FCntAprx()`
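If you implement the quantization phase yourself, the core building block is k-means clustering run independently over the weight sub-vectors of each subspace: the resulting centroids form a sub-codebook (one block of D), and each sub-vector's nearest-centroid index is its entry in B. Below is a minimal, hypothetical sketch of one such k-means run (plain reconstruction-error k-means only; the paper goes further and minimizes each layer's response error, so treat this as a starting point):

```cpp
// Illustrative k-means for one subspace; not code from this repository.
#include <vector>

// Cluster 'vecs' (each of dimension 'dim') into K codewords; writes each
// vector's codeword index into 'assign' and returns the codewords.
std::vector<std::vector<float>> KMeansSubspace(
    const std::vector<std::vector<float>>& vecs,
    int K, int dim, int iters, std::vector<int>* assign) {
  std::vector<std::vector<float>> codewords(K);
  for (int k = 0; k < K; ++k)  // naive init: seed with existing sub-vectors
    codewords[k] = vecs[static_cast<size_t>(k) % vecs.size()];
  assign->assign(vecs.size(), 0);
  for (int it = 0; it < iters; ++it) {
    // Assignment step: nearest codeword in squared L2 distance.
    for (size_t i = 0; i < vecs.size(); ++i) {
      float best = 1e30f;
      for (int k = 0; k < K; ++k) {
        float d = 0.0f;
        for (int j = 0; j < dim; ++j) {
          const float diff = vecs[i][j] - codewords[k][j];
          d += diff * diff;
        }
        if (d < best) { best = d; (*assign)[i] = k; }
      }
    }
    // Update step: move each codeword to the mean of its assigned vectors.
    std::vector<std::vector<float>> sum(K, std::vector<float>(dim, 0.0f));
    std::vector<int> cnt(K, 0);
    for (size_t i = 0; i < vecs.size(); ++i) {
      const int k = (*assign)[i];
      ++cnt[k];
      for (int j = 0; j < dim; ++j) sum[k][j] += vecs[i][j];
    }
    for (int k = 0; k < K; ++k)
      if (cnt[k] > 0)
        for (int j = 0; j < dim; ++j) codewords[k][j] = sum[k][j] / cnt[k];
  }
  return codewords;
}
```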
OK!
Do you have any plans to open-source all the code?
We do not have such a plan at the moment. Sorry.
Is the whole code released?
@wuzhiyang2016 No, only the inference code is released.