some questions about quantization in TensorFlow
rthenamvar opened this issue · 0 comments
rthenamvar commented
I've read through the official guide but ran into trouble understanding a few concepts:
- Is it possible to use Quantization Aware Training without converting the model to a TF Lite model at the end?
- Can I change the framework's default of 8-bit quantization? The official documentation mentions 4-bit and 16-bit quantization as experimental, meaning those models cannot be converted to TF Lite models. But isn't it possible to use such models without converting them to TF Lite at all?
Thanks