hustzxd/LSQuantization

How to deploy the quantized model?

Opened this issue · 5 comments

Once we have trained the quantized model, how do we deploy it?

Siamw commented

use original model with strict=False

use original model with strict=False

I mean this is different from the official Quantization Aware Training interface (https://pytorch.org/docs/stable/quantization.html). I don't know how to get the int8 model from the trained model.

Siamw commented

Not sure what "int8 model" means.

Anyway, I use this method. In the model's init:

during training
self.conv1 = lsqconv(~~)

during inference
self.conv1 = nn.Conv2d(~~)

name "self.conv1" does not changed and weights/ bias shapes are same between them. so it works.

i wrote "using strict=false" because if not, it causes error when using upper codes.

Not sure what "int8 model" means.

"int8" means 8-bit: a model quantized down from the float (32-bit) model.
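For reference, the official PyTorch QAT flow linked above is what actually produces a true int8 model, via torch.quantization.convert. This is a minimal sketch of that API (the standard fake-quant workflow, not this repo's LSQ layers):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # QuantStub/DeQuantStub mark where tensors enter and leave int8.
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = TinyNet()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)

# ... run the usual QAT training loop here ...

model.eval()
int8_model = torch.quantization.convert(model)  # real int8 weights/kernels
```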

Hello, did you ever figure this out? I'm new to this and would also like to know how to get the int8 model.