tensorflow/model-optimization

Input and resource quantization

Jwy-Leo opened this issue · 1 comment

System information

  • TensorFlow version (you are using): tensorflow 2.9.1
  • Are you willing to contribute it (Yes/No): No

Motivation
Inputs and resources (non-trainable variables) need an interface for custom layer quantization.
Normalization is a basic structure in DNNs, but we cannot quantize it easily:

  1. There is no input quantization interface in the custom configuration, so the mul-add structure of the normalization stays in float32 instead of int8 in the TFLite file.
  2. There is no resource quantization interface in the custom configuration, so batch norm's resources (running mean, running variance) are left without QAT (see the sketch below).
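For illustration only, here is a minimal sketch (not from the original report; `MyNorm` and `MyNormQuantizeConfig` are hypothetical names) of a custom `QuantizeConfig` for a normalization-style custom layer with `tensorflow_model_optimization`. The point it tries to show is that the interface only exposes hooks for weights, activations, and outputs, so there is no obvious place to attach quantizers to the layer's inputs or to non-trainable resources such as the moving mean/variance.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Hypothetical custom normalization layer, analogous to batch norm at inference.
class MyNorm(tf.keras.layers.Layer):
    def build(self, input_shape):
        dim = input_shape[-1]
        self.gamma = self.add_weight("gamma", shape=(dim,), initializer="ones")
        self.beta = self.add_weight("beta", shape=(dim,), initializer="zeros")
        # Non-trainable "resources", like BN's running statistics.
        self.moving_mean = self.add_weight(
            "moving_mean", shape=(dim,), initializer="zeros", trainable=False)
        self.moving_var = self.add_weight(
            "moving_var", shape=(dim,), initializer="ones", trainable=False)

    def call(self, x):
        # Mul-add structure: (x - mean) * rsqrt(var + eps) * gamma + beta
        return (x - self.moving_mean) * tf.math.rsqrt(self.moving_var + 1e-3) \
            * self.gamma + self.beta


class MyNormQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    # Only weights, activations, and outputs can be declared here; there is no
    # method for quantizing the layer inputs or the non-trainable
    # moving_mean / moving_var resources.
    def get_weights_and_quantizers(self, layer):
        q = tfmot.quantization.keras.quantizers.LastValueQuantizer(
            num_bits=8, symmetric=True, narrow_range=False, per_axis=False)
        return [(layer.gamma, q), (layer.beta, q)]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.gamma, layer.beta = quantize_weights

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        return [tfmot.quantization.keras.quantizers.MovingAverageQuantizer(
            num_bits=8, per_axis=False, symmetric=False, narrow_range=False)]

    def get_config(self):
        return {}
```

The usual flow would then be to wrap the layer with `tfmot.quantization.keras.quantize_annotate_layer(MyNorm(), quantize_config=MyNormQuantizeConfig())`, annotate the model, and call `quantize_apply` inside a `quantize_scope({'MyNorm': MyNorm, 'MyNormQuantizeConfig': MyNormQuantizeConfig})`; even then, the input tensor and the moving statistics have no quantizers attached, which is the gap this issue describes.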
Xhark commented

Hi, would you please give us some examples? We usually assume BNs would be folded (fused) into a nearby layer for optimization. I'd like to know some use cases where this is required. Thanks.