Quantized model slower than non-quantized model on Windows x64
imjking opened this issue · 1 comment
Issue Type
Performance
OS
Windows
OS architecture
Other
Programming Language
Python
Framework
TensorFlow, TensorFlowLite
Download URL for tflite file
https://storage.googleapis.com/mediapipe-assets/face_detection_short_range.tflite
https://storage.googleapis.com/mediapipe-assets/face_detection_full_range.tflite
Convert Script
tflite2tensorflow --model_path face_detection_short_range.tflite --flatc_path ../flatc --schema_path ../schema.fbs --output_pb
tflite2tensorflow --model_path face_detection_short_range.tflite --flatc_path ../flatc --schema_path ../schema.fbs --output_no_quant_float32_tflite --output_dynamic_range_quant_tflite --output_weight_quant_tflite --output_float16_quant_tflite --output_integer_quant_tflite
Description
Hello, as the title says: I converted the model successfully, but the quantized model is slower than the original float32 model, and the integer-quantized model is slower than the weight-quantized model on Windows x64. Do you know the reason? Looking forward to your reply.
Relevant Log Output
Inference time:
original float32: 15 ms
weight quant: 150 ms
integer quant: 200 ms
Source code for simple inference testing code
No response
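(For reference, since no test script was attached: a minimal timing sketch like the one below could be used to reproduce the comparison. This is not the original poster's code; the file names follow tflite2tensorflow's usual output naming and the thread count is an assumption.)

```python
# Minimal benchmarking sketch: average per-inference latency for each
# converted .tflite file. Paths and num_threads are assumptions.
import time
import numpy as np
import tensorflow as tf

MODEL_PATHS = [
    "saved_model/model_float32.tflite",
    "saved_model/model_weight_quant.tflite",
    "saved_model/model_integer_quant.tflite",
]

def benchmark(model_path, runs=100, threads=4):
    interpreter = tf.lite.Interpreter(model_path=model_path, num_threads=threads)
    interpreter.allocate_tensors()
    input_detail = interpreter.get_input_details()[0]
    # Dummy input matching the model's expected shape and dtype.
    dummy = np.random.random_sample(input_detail["shape"]).astype(input_detail["dtype"])
    # Warm-up invocation so one-time setup cost is excluded from timing.
    interpreter.set_tensor(input_detail["index"], dummy)
    interpreter.invoke()
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.set_tensor(input_detail["index"], dummy)
        interpreter.invoke()
    elapsed_ms = (time.perf_counter() - start) / runs * 1000.0
    print(f"{model_path}: {elapsed_ms:.2f} ms/inference")

for path in MODEL_PATHS:
    benchmark(path)
```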
It's too much trouble to answer in detail, so I'll just give you the conclusion: it is normal for these quantized models to run slower. Search the Internet and the existing GitHub issues for the explanation; I don't want to give the same answer over and over again.