hustvl/PD-Quant

Why does the quantized model use float weights during the inference stage?

qifeng22 opened this issue · 6 comments

x_quant = torch.clamp(x_int + self.zero_point, 0, self.n_levels - 1)
x_float_q = (x_quant - self.zero_point) * self.delta

@jiawei-liu1103
This code is from class AdaRoundQuantizer(nn.Module).

Hi, when the reconstruction is completed, we set the "soft_targets" of the weight quantizer to False, so "x_int" is the value after quantization. "x_float_q" is indeed a floating-point number, but it is the value after dequantization; this operation is called fake quantization.
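
For illustration, here is a minimal sketch of how an AdaRound-style quantizer behaves once "soft_targets" is set to False. The class name, the sigmoid-based soft rounding, and the parameter initialization are simplified assumptions for this sketch, not PD-Quant's exact code:

```python
import torch
import torch.nn as nn

class AdaRoundQuantizerSketch(nn.Module):
    # Simplified AdaRound-style weight quantizer, for illustration only.
    def __init__(self, weight, delta, zero_point, n_bits=8):
        super().__init__()
        self.delta = delta                  # quantization step size
        self.zero_point = zero_point        # integer offset
        self.n_levels = 2 ** n_bits         # number of representable integer levels
        self.soft_targets = True            # True only while reconstruction is running
        # learnable per-element rounding variable optimized during reconstruction
        self.alpha = nn.Parameter(torch.zeros_like(weight))

    def forward(self, x):
        x_floor = torch.floor(x / self.delta)
        if self.soft_targets:
            # continuous relaxation of the rounding decision
            # (placeholder for AdaRound's rectified sigmoid)
            x_int = x_floor + torch.sigmoid(self.alpha)
        else:
            # after reconstruction: hard 0/1 rounding, so x_int is integer-valued
            x_int = x_floor + (self.alpha >= 0).float()
        # clamp to the valid integer grid -> the quantized value
        x_quant = torch.clamp(x_int + self.zero_point, 0, self.n_levels - 1)
        # dequantize back to float; this float->int->float round trip is fake quantization
        x_float_q = (x_quant - self.zero_point) * self.delta
        return x_float_q
```

With soft_targets = False, x_int is integer-valued (floor plus a hard 0/1 rounding decision), while x_float_q is still a float tensor on the original weight scale, which is why float weights appear at inference time.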

Does the inference accuracy with fake quantization equal that of real on-device (integer) quantization? If so, is there any reference with a theoretical analysis of fake quantization versus real quantization?

The process of fake quantization is "float -> int -> float". The first step, "float -> int", already simulates the loss caused by quantization (truncation error and rounding error). However, if the real hardware only supports integer arithmetic, the results of fake quantization may be slightly better than those of pure integer quantization, because the subsequent computations still run in floating point. You can google the terms "fake quantize" and "dequantize" to understand the difference between fake and real quantization.
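
As a rough sketch of that "float -> int -> float" round trip (the per-tensor delta and zero_point below are made-up values for demonstration, not taken from PD-Quant):

```python
import torch

def fake_quantize(x, delta, zero_point, n_bits=8):
    # float -> int: rounding and clamping introduce the quantization error
    x_int = torch.round(x / delta) + zero_point
    x_quant = torch.clamp(x_int, 0, 2 ** n_bits - 1)
    # int -> float: dequantize so downstream ops still run in floating point
    return (x_quant - zero_point) * delta

w = torch.randn(4, 4)
delta = w.abs().max() / 127        # illustrative step size
zero_point = 128                   # illustrative offset for an unsigned 8-bit grid
w_fq = fake_quantize(w, delta, zero_point)
print((w - w_fq).abs().max())      # the simulated quantization error
```

On integer-only hardware, the dequantize step is folded into the surrounding operations and the matrix multiplications are carried out in integer arithmetic, so the result can differ slightly from this floating-point simulation.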

okok, thank you