[circle2circle] Introduce a pass to simulate mixed-precision operators
jinevening opened this issue · 2 comments
jinevening commented
What
Let's introduce a new pass, RemoveQDQForMixedPrecisionOp.
Why
When we make a fake-quantized model, duplicate QDQ (Quantize-Dequantize) patterns sometimes appear, as below.
In the above example, the first QDQ is for q8, and the second QDQ is for q16. On some backends, the FC layer can directly generate q16 output even though its inputs are q8 (for higher accuracy). Such an operator is often called a 'mixed-precision operator'.
To simulate the behavior of a mixed-precision operator, we need a pass that removes the first QDQ from the above pattern, so that the FC output goes only through the q16 QDQ.
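As a rough illustration of the idea (not the actual pass implementation), here is a minimal Python sketch on a toy linear graph. The `Node` class, the `remove_first_qdq` function, and the assumption that both QDQ pairs directly follow the FC output are hypothetical and made up for this example; the real pass would operate on the circle IR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Toy node in a linear graph (hypothetical, for illustration only)."""
    op: str                      # e.g. "FullyConnected", "Quantize", "Dequantize"
    dtype: Optional[str] = None  # target type of a Quantize node, e.g. "q8" or "q16"


# Assumption: only FC is treated as a mixed-precision operator here.
MIXED_PRECISION_OPS = {"FullyConnected"}


def is_qdq(nodes: List[Node], i: int) -> bool:
    """Return True if nodes[i], nodes[i + 1] form a Quantize-Dequantize pair."""
    return (
        i + 1 < len(nodes)
        and nodes[i].op == "Quantize"
        and nodes[i + 1].op == "Dequantize"
    )


def remove_first_qdq(nodes: List[Node]) -> List[Node]:
    """Drop a QDQ pair that follows a mixed-precision op and precedes another QDQ."""
    out: List[Node] = []
    i = 0
    while i < len(nodes):
        after_mixed_op = bool(out) and out[-1].op in MIXED_PRECISION_OPS
        if after_mixed_op and is_qdq(nodes, i) and is_qdq(nodes, i + 2):
            i += 2  # skip the first Quantize-Dequantize pair
            continue
        out.append(nodes[i])
        i += 1
    return out


graph = [
    Node("FullyConnected"),
    Node("Quantize", "q8"),
    Node("Dequantize"),
    Node("Quantize", "q16"),
    Node("Dequantize"),
]
result = remove_first_qdq(graph)
print([(n.op, n.dtype) for n in result])
```

In this sketch, only the q16 QDQ remains after the FC node, which mimics an FC that produces q16 output directly.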
jinevening commented
I used the [circle2circle] tag because I'm not sure it is OK to expose this option to users (one-optimize).
jinevening commented
Done