How to measure inference time
re-ag commented
Hi, thanks for your great work.
I want to compare the inference time of MobileNet before and after compression. I measured with the batch size set to 1 using the code snippet below (from eval_mobilenet.py). Is this the correct way?
import torch

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

start.record()
outputs = net(inputs)  # forward pass
end.record()
torch.cuda.synchronize()  # wait for all queued GPU work to finish
curr_time = start.elapsed_time(end)  # elapsed time in milliseconds
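For reference, I average curr_time over many runs after warming up the GPU. A minimal sketch of that loop (the warm-up and run counts here are illustrative choices, and net and inputs are assumed to already be on the GPU):

import torch

with torch.no_grad():
    # Warm-up runs so CUDA initialization and kernel caching
    # do not inflate the first measurements (100 is an arbitrary choice)
    for _ in range(100):
        net(inputs)
    torch.cuda.synchronize()

    times = []
    for _ in range(300):  # number of timed runs, also arbitrary
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        net(inputs)
        end.record()
        torch.cuda.synchronize()
        times.append(start.elapsed_time(end))  # milliseconds

avg_time_ms = sum(times) / len(times)
print(f"average inference time: {avg_time_ms:.3f} ms")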
Measured this way, the inference time of MobileNet was about 2.9 ms both before and after compression (this is the average of curr_time over all runs). This is much larger than the 0.4 ms reported in the paper, so I would like to know how the inference time in the paper was measured.
Thank you.