TRI-ML/vlm-evaluation

Slow model inference during evaluation

Closed this issue · 1 comment

Thanks for your wonderful work, which helps me a lot.

When using this tool, I ran into very slow model inference. Evaluation takes a long time, especially on the vqa-v2-full dataset with 214,354 samples.

I noticed that device_batch_size is fixed to 1 due to a bug in HF and cannot be increased.

# Inference Parameters
device_batch_size: int = 1 # Device Batch Size set to 1 until LLaVa/HF LLaMa fixes bugs!
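
For reference, once that upstream issue is fixed, batched generation with the HF LLaVA port would look roughly like the sketch below. This is illustrative only: it calls the llava-hf checkpoint and HF APIs directly rather than this repo's model wrappers, and the prompts and images are dummy placeholders.

import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Decoder-only LMs need left padding so every sequence ends where generation starts.
processor.tokenizer.padding_side = "left"

prompts = [
    "USER: <image>\nWhat color is the bus? ASSISTANT:",
    "USER: <image>\nHow many people are in the photo? ASSISTANT:",
]
images = [Image.new("RGB", (336, 336)) for _ in prompts]  # dummy images for illustration

# Pad the batch and move floating-point tensors to the model's dtype and device.
inputs = processor(text=prompts, images=images, padding=True, return_tensors="pt")
inputs = inputs.to(model.device, torch.float16)

with torch.inference_mode():
    output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=False)

answers = processor.batch_decode(output_ids, skip_special_tokens=True)
print(answers)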

I am wondering whether there is any other way to speed up inference during evaluation.
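
For context, one workaround I am considering is to keep device_batch_size at 1 but shard the examples across GPUs and run one evaluation process per shard, merging the result files afterwards. A rough sketch of the index split (run_eval_on_indices is a hypothetical stand-in, not a function from this repo):

import os

def shard_indices(num_examples: int, num_shards: int, shard_id: int) -> list[int]:
    # Round-robin split so every shard gets a near-equal number of examples.
    return list(range(shard_id, num_examples, num_shards))

if __name__ == "__main__":
    num_shards = int(os.environ.get("WORLD_SIZE", "1"))
    shard_id = int(os.environ.get("RANK", "0"))
    indices = shard_indices(214_354, num_shards, shard_id)
    print(f"shard {shard_id}/{num_shards} handles {len(indices)} examples")
    # run_eval_on_indices(indices)  # hypothetical: evaluate this shard only, then merge results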

We’re working on adding robust support for batched inference to speed up workloads like this one. Unfortunately, VQA-v2 just has a massive validation set. If you’re iterating on experiments, I’d consider trying the “subsampled” variant of the dataset (16K examples), and only running the full 200K set when you need final numbers!
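
If the built-in subsampled variant doesn't fit your workflow, drawing your own fixed-seed subset of the VQA-v2 validation questions is also straightforward. A minimal sketch, assuming the standard VQA-v2 annotation files rather than this repo's internal dataset layout:

import json
import random

random.seed(7)  # fixed seed so every run scores the same subset

with open("v2_OpenEnded_mscoco_val2014_questions.json") as f:
    data = json.load(f)

# Keep a random 16K subset of the ~214K validation questions.
data["questions"] = random.sample(data["questions"], k=16_000)

with open("vqa_v2_val_16k_questions.json", "w") as f:
    json.dump(data, f)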