How to measure inference time
re-ag commented
Hi, thanks for your great work.
I want to compare the inference time of MobileNet before and after compression. I measured with the batch size set to 1 using the code snippet below (from eval_mobilenet.py). Is this the correct way?
import torch

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

start.record()
outputs = net(inputs)  # forward pass
end.record()
torch.cuda.synchronize()  # wait for all queued GPU work to finish
curr_time = start.elapsed_time(end)  # elapsed time in milliseconds
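For reference, I average curr_time over many runs after warming up the GPU. A minimal sketch of that loop (the warm-up and run counts here are illustrative choices, and net and inputs are assumed to already be on the GPU):

import torch

with torch.no_grad():
    # Warm-up runs so CUDA initialization and kernel caching
    # do not inflate the first measurements (100 is an arbitrary choice)
    for _ in range(100):
        net(inputs)
    torch.cuda.synchronize()

    times = []
    for _ in range(300):  # number of timed runs, also arbitrary
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        net(inputs)
        end.record()
        torch.cuda.synchronize()
        times.append(start.elapsed_time(end))  # milliseconds

avg_time_ms = sum(times) / len(times)
print(f"average inference time: {avg_time_ms:.3f} ms")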
Measured this way, the inference time of MobileNet was about 2.9 ms both before and after compression (this is the average of curr_time over all runs). This is much larger than the 0.4 ms reported in the paper, so I would like to know how the inference time in the paper was measured.
Thank you.