Why are you disabling network output quantization?
padeirocarlos commented
Hi, could you please give more details on why you disable quantization of the network output?
https://github.com/hustvl/PD-Quant/blob/main/main_imagenet.py#L212
My other question is about setting the first and last layers to 8-bit. Why is that?
https://github.com/hustvl/PD-Quant/blob/main/main_imagenet.py#L210
jiawei-liu1103 commented
Quantizing the first and last layers to 8-bit, and leaving the model output unquantized, are choices based on practical hardware considerations. If the first and last layers are quantized to low bit-widths, a relatively large accuracy degradation occurs.
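For readers unfamiliar with this convention, a minimal sketch of the per-layer bit-width assignment being described might look like the following. The helper name, layer names, and defaults are hypothetical illustrations of the common PTQ practice, not PD-Quant's actual API:

```python
def assign_bitwidths(layer_names, weight_bits=4, boundary_bits=8):
    """Assign per-layer weight bit-widths (illustrative sketch).

    Mirrors the convention described above: the first and last layers
    stay at 8-bit, intermediate layers use the low target bit-width,
    and the final network output is left unquantized (None).
    """
    bits = {}
    for i, name in enumerate(layer_names):
        if i == 0 or i == len(layer_names) - 1:
            bits[name] = boundary_bits  # first/last layer: keep 8-bit
        else:
            bits[name] = weight_bits    # intermediate layers: low-bit
    bits["output"] = None               # model output: not quantized
    return bits


cfg = assign_bitwidths(["conv1", "layer1", "layer2", "fc"], weight_bits=2)
# conv1 and fc stay at 8 bits, layer1/layer2 use 2 bits,
# and the output entry is None (unquantized)
```

This keeps the most quantization-sensitive parts of the network (the input stem, the classifier head, and the logits themselves) at higher precision, which is also what many hardware quantization toolchains assume.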