Yxxxb/VoCo-LLaMA

How to compare the inference time?

Closed this issue · 4 comments

Hi, authors. I am wondering how you measure and report efficiency in terms of inference time.

Besides, I would like to know how to measure CUDA time. Thanks a lot.

Hi,

llama.cpp, LLaVA-cli, or Python's simple time functions can all be used to measure inference time.
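For reference, here is a minimal sketch (not the authors' script) of the Python-side approach, covering both wall-clock time via `time.perf_counter` and GPU-side CUDA time via `torch.cuda.Event`. The `model`, `inputs`, and `max_new_tokens` value are hypothetical placeholders for a Hugging Face-style model and pre-tokenized inputs already on the GPU:

```python
import time
import torch

def measure_wall_clock(model, inputs, n_runs=10):
    """Wall-clock latency with Python's time module.
    torch.cuda.synchronize() is required because CUDA kernels
    launch asynchronously relative to the host."""
    # Warm-up runs to exclude one-time CUDA/initialization costs.
    for _ in range(3):
        model.generate(**inputs, max_new_tokens=32)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_runs):
        model.generate(**inputs, max_new_tokens=32)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_runs  # seconds per run

def measure_cuda_time(model, inputs, n_runs=10):
    """GPU-side time with CUDA events, which time the kernels
    themselves rather than host-side overhead."""
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(n_runs):
        model.generate(**inputs, max_new_tokens=32)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / 1000.0 / n_runs  # elapsed_time is in ms
```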

@Yxxxb What batch size do you use?

The overall batch size is 128, the same as in the LLaVA SFT stage. You can check the training hyperparameters in the "Additional Implementation Details" section of our paper's appendix.