Inference time is too slow

Question

Congdinh1801 opened this issue 3 years ago · 3 comments

I trained the model from scratch on the Cityscapes dataset and used it to run inference. My GPU is a Quadro RTX 6000 96 GB.
Inference runs at 0.61 fps, i.e. 1637 ms per image, whereas according to the paper it should take 166 ms per image. Has anyone run into the same problem, or does anyone know what could be making my inference so slow? Thanks.

Answer 1 · 2021-07-13T13:01:21.000Z

For the sake of simplicity, this code-base does not include the optimization steps that we used for the paper, such as a faster depth-wise-separable convolution implementation from external repos, fusion step optimization, cudnn etc.
Still, your inference time is too high. Please make sure that no other process is running while you measure the inference time (this looks like the most likely cause of such a high runtime). In my experience, additional load significantly increases the runtime values.
Otherwise, to help with debugging, it would be nice if you could report the run-times of the different parts of the network (with CUDA_LAUNCH_BLOCKING=1).
Additionally, refer to our paper for how we compute the runtime. With this code-base, you should be able to get a runtime of 200-250 ms without any problem.
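
For the per-part timing suggested above, a minimal sketch of a timing helper could look like the following. It is an illustration only, not the procedure used in the paper: the names `time_stage` and `ms_to_fps` are hypothetical, and the warm-up and iteration counts are arbitrary assumptions. The optional `sync` callable is there because GPU kernels are launched asynchronously, so the clock should only be read once the device has finished.

```python
import time
from statistics import mean


def time_stage(fn, *args, sync=None, warmup=5, iters=20):
    """Return the mean wall-clock time of fn(*args) in milliseconds.

    fn     : the stage to time (hypothetical: a backbone, a head, the full model)
    sync   : optional zero-argument callable that blocks until the device is idle
    warmup : untimed calls, so one-off setup costs do not skew the result
    iters  : number of timed calls that are averaged
    """
    for _ in range(warmup):
        fn(*args)
    if sync is not None:
        sync()  # make sure warm-up work has finished before timing starts

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # wait for the device before reading the clock
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def ms_to_fps(ms_per_image):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image
```

With PyTorch, this could be called as `time_stage(model, image, sync=torch.cuda.synchronize)`, where `model` and `image` are placeholders for your own network (or one part of it) and an input tensor already on the GPU. As a sanity check on the numbers in this thread, `ms_to_fps(1637)` is roughly 0.61, which matches the reported fps.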

Answer 2 · 2021-08-18T15:28:12.000Z

Hi @Congdinh1801, have you been able to investigate what slows down your inference? I have exactly the same problem and about the same inference time as you: around 1700 ms on average per image, with no other processes running at the same time. I am running on a Jetson Xavier 32GB.

Answer 3 · 2021-08-18T17:34:26.000Z

@95maddycode no, I haven't been able to improve the run time :(