Why does the quantized model run slower?

Question

Why does the quantized model run slower?

Opened this issue 5 years ago · 4 comments

I know these models are designed to run on the Coral USB Accelerator, but shouldn't they also run faster on a PC?

It takes about 1.15 seconds to run tiny-yolov3.tflite on my computer, but around 15 seconds to run quant_coco-tiny-v3-relu.tflite.

Is this normal behaviour?
Will it run faster than 1.15 s on the Coral USB TPU when I buy it?

Thanks for answering!

Answer 1 · 2020-04-09T10:30:24.000Z

I don't know the reason, but I also got bad performance when running the quantized model on a desktop CPU. However, it runs fine, as expected, on the Edge TPU.
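For anyone who wants to reproduce the CPU comparison, here is a minimal timing sketch. It assumes TensorFlow and NumPy are installed and that both .tflite files from the question sit in the working directory. The helper names (`benchmark`, `make_tflite_runner`) are made up for this example and are not part of this repository. It feeds an all-zeros input, so it measures inference time only, not detection quality.

```python
import statistics
import time


def benchmark(run_once, warmup=3, runs=20):
    """Time run_once(): `warmup` untimed calls, then `runs` timed calls.

    Returns (mean_seconds, stdev_seconds) over the timed calls.
    """
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.pstdev(samples)


def make_tflite_runner(model_path, num_threads=None):
    """Build a zero-argument function that runs one inference on dummy input."""
    # Imported lazily so that benchmark() can be used without TensorFlow.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path, num_threads=num_threads)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    # All-zeros tensor with the shape and dtype the model expects.
    dummy = np.zeros(tuple(inp["shape"]), dtype=inp["dtype"])

    def run_once():
        interpreter.set_tensor(inp["index"], dummy)
        interpreter.invoke()

    return run_once


if __name__ == "__main__":
    # File names taken from the question; adjust the paths as needed.
    for path in ("tiny-yolov3.tflite", "quant_coco-tiny-v3-relu.tflite"):
        mean_s, stdev_s = benchmark(make_tflite_runner(path), warmup=1, runs=5)
        print(f"{path}: {mean_s:.3f} s +/- {stdev_s:.3f} s per inference")
```

The warm-up calls keep one-time setup cost out of the measurement, and `num_threads` can be varied to see whether threading changes the gap between the two models.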

Answer 2 · 2021-01-13T11:39:59.000Z

@guichristmann What kind of inference speeds did you get with YOLOv3-tiny? In terms of FPS?

Answer 3 · 2021-08-21T00:55:33.000Z

@oroelipas @parthjdoshi @guichristmann Hi. I was able to convert my yolov3 model to a quantized tflite model, but when I run the inference.py script, I do not see any detections on the input image.
Were you able to get it working on this platform? Did you make any changes to the script?
Please help me get this running. Thank you in advance, guys.

Answer 4 · 2021-10-27T20:03:20.000Z

@parthjdoshi I get around 16-17 fps with a yolov3-tiny model that uses relu as the activation function, on a device with an Edge TPU.
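To put the numbers in this thread on the same scale, the small sketch below converts between frames per second and per-frame latency. The function names are made up for this example. The figures come from different users and setups, so treat the comparison as rough.

```python
def fps_to_latency_ms(fps):
    """Per-frame latency in milliseconds for a given frame rate."""
    return 1000.0 / fps


def latency_s_to_fps(seconds):
    """Frame rate for a given per-frame latency in seconds."""
    return 1.0 / seconds


if __name__ == "__main__":
    # Edge TPU figures reported in this thread: 16-17 fps.
    for fps in (16, 17):
        print(f"{fps} fps = {fps_to_latency_ms(fps):.1f} ms per frame")
    # Desktop CPU timings from the original question: 1.15 s and 15 s.
    for seconds in (1.15, 15.0):
        print(f"{seconds} s per frame = {latency_s_to_fps(seconds):.2f} fps")
```

At 16-17 fps, each frame takes roughly 59 to 62.5 ms, compared with the 1.15 s per inference reported for the float model on the desktop CPU.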