[circle2circle] Introduce a pass to simulate mixed-precision operators

Question

jinevening opened this issue 5 months ago · 2 comments

What

Let's introduce a new pass RemoveQDQForMixedPrecisionOp

Why

When we make a fake-quantized model, duplicate QDQ (Quantize-Dequantize) patterns sometimes appear, as below.

In the above example, the first QDQ is for q8, and the second QDQ is for q16. On some backends, an FC (FullyConnected) layer can directly generate q16 output even though its inputs are q8 (for higher accuracy). Such an operator is often called a 'mixed-precision operator'.
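
To illustrate why the duplicate QDQ matters, here is a minimal numeric sketch of fake quantization. It assumes symmetric quantization with hypothetical scales covering [-1, 1] for both q8 and q16; the real scales and zero points come from calibration and may differ. `fake_quantize`, `qdq8`, and `qdq16` are illustrative names, not part of circle2circle.

```python
def fake_quantize(x, scale, qmin, qmax):
    """Quantize x to an integer grid, clamp it, then dequantize back to float."""
    q = round(x / scale)
    q = max(qmin, min(qmax, q))
    return q * scale


# Assumed symmetric scales for the range [-1, 1] (hypothetical values).
SCALE_Q8 = 1.0 / 127
SCALE_Q16 = 1.0 / 32767


def qdq8(x):
    return fake_quantize(x, SCALE_Q8, -127, 127)


def qdq16(x):
    return fake_quantize(x, SCALE_Q16, -32767, 32767)


x = 0.3                  # stand-in for one float value of the FC output
both = qdq16(qdq8(x))    # duplicate QDQ: q8 first, then q16
only16 = qdq16(x)        # first QDQ removed: q16 only

err_both = abs(both - x)
err_only16 = abs(only16 - x)
print(err_both, err_only16)
```

With both QDQs in place, the q8 rounding error is already baked into the value before the q16 QDQ sees it, so the output cannot be more accurate than q8. Keeping only the q16 QDQ models an operator that writes q16 directly.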

To simulate the behavior of a mixed-precision operator, we need a pass that removes the first QDQ in the above pattern.
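
The idea can be sketched on a toy graph. This is not the actual pass: it assumes the pattern is FullyConnected followed by two consecutive Quantize-Dequantize pairs, and it models the graph as a flat list of ops. The `Op` class and `remove_qdq_for_mixed_precision_op` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Op:
    kind: str        # e.g. "FullyConnected", "Quantize", "Dequantize"
    dtype: str = ""  # quantized dtype for Quantize ops, e.g. "q8" or "q16"


# Assumed pattern: FC -> Q -> DQ -> Q -> DQ (linear chain).
PATTERN = ["FullyConnected", "Quantize", "Dequantize", "Quantize", "Dequantize"]


def remove_qdq_for_mixed_precision_op(ops):
    """Return a new op list with the first QDQ removed wherever PATTERN matches."""
    result = []
    i = 0
    while i < len(ops):
        window = ops[i:i + len(PATTERN)]
        if [op.kind for op in window] == PATTERN:
            # Keep FC and the second QDQ; drop the first QDQ (window[1], window[2]).
            result.extend([window[0], window[3], window[4]])
            i += len(PATTERN)
        else:
            result.append(ops[i])
            i += 1
    return result


graph = [
    Op("FullyConnected"),
    Op("Quantize", "q8"),
    Op("Dequantize"),
    Op("Quantize", "q16"),
    Op("Dequantize"),
]
optimized = remove_qdq_for_mixed_precision_op(graph)
print([(op.kind, op.dtype) for op in optimized])
```

A graph with only a single QDQ after the FC layer does not match the pattern and is returned unchanged.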

jinevening commented 5 months ago

Done

Answer 1 · 2024-05-03T08:18:37.000Z

I used the [circle2circle] tag because I'm not sure whether it is OK to expose this option to users (one-optimize).