neuralize-ai/edgerunner

Support quantization

Closed · 0 comments

QNN requires specifying fp16 or quantized execution before a graph runs. To support this generically, we need a way to query the execution precision from the model itself. One proxy: if none of the model's inputs or outputs are floating point, default to quantized execution. This heuristic may not apply universally, as sketched below.
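
A minimal sketch of that proxy check, assuming hypothetical `TensorType` and `TensorInfo` names (edgerunner's actual tensor API may differ):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical tensor type enum; the library's actual enum may differ.
enum class TensorType { Float16, Float32, UInt8, Int8, Int32 };

// Hypothetical stand-in for whatever the model exposes per input/output.
struct TensorInfo {
    TensorType type;
};

auto isFloatingPoint(TensorType type) -> bool {
    return type == TensorType::Float16 || type == TensorType::Float32;
}

// Proxy heuristic from this issue: if no input or output tensor is
// floating point, assume the model should execute quantized.
auto shouldRunQuantized(const std::vector<TensorInfo>& inputs,
                        const std::vector<TensorInfo>& outputs) -> bool {
    const auto anyFloat = [](const std::vector<TensorInfo>& tensors) {
        return std::any_of(
            tensors.begin(), tensors.end(), [](const TensorInfo& tensor) {
                return isFloatingPoint(tensor.type);
            });
    };
    return !anyFloat(inputs) && !anyFloat(outputs);
}
```

Note the caveat: a model with quantized weights but float32 I/O (e.g. quantize/dequantize ops at the graph boundaries) would defeat this check, so it can only serve as a fallback when precision cannot be queried directly.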