mlpc-ucsd/TokenCompose

Why can batch size only be 1?

aiiph4 opened this issue · 2 comments

Hi, congrats on your great work!

I wonder why the batch size can only be set to 1. It seems to me that the controller could store attention maps for a larger batch. Is this due to memory cost, or is there another reason?

Hi,

Thanks for reaching out! Different examples have different cross-attention map indices (e.g., from different noun tokens) used to compute the grounding objectives, so batched training requires some extra engineering to handle a variable number of token indices per example. We simply did not implement it for this research project, but batched training is feasible from an engineering standpoint; see the rough sketch below.
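As a rough illustration (not TokenCompose's actual code), one way to handle per-example token indices is to loop over the batch inside the grounding loss, indexing each example's cross-attention maps with its own noun-token indices. The function name, tensor shapes, and the simplified "attention inside mask" objective below are all assumptions made for the sketch:

```python
import torch

def batched_grounding_loss(attn_maps, token_indices_per_example, masks_per_example):
    """
    Hypothetical sketch of a batched grounding loss with variable token indices.

    attn_maps: (B, heads, HW, num_tokens) cross-attention maps.
    token_indices_per_example: list of length B; each entry is a LongTensor of
        that example's noun-token indices (variable length per example).
    masks_per_example: list of length B; each entry is a list of (HW,) binary
        masks, one per noun token of that example.
    """
    losses = []
    for b, (idx, masks) in enumerate(zip(token_indices_per_example, masks_per_example)):
        # Select attention maps for this example's noun tokens: (heads, HW, k)
        attn = attn_maps[b, :, :, idx]
        attn = attn.mean(dim=0)             # average over heads -> (HW, k)
        mask = torch.stack(masks, dim=-1)   # (HW, k)
        # Encourage each token's attention mass to fall inside its mask
        # (a simplified stand-in for the paper's grounding terms).
        inside = (attn * mask).sum(dim=0)
        total = attn.sum(dim=0) + 1e-8
        losses.append((1.0 - inside / total).mean())
    return torch.stack(losses).mean()
```

The per-example loop avoids padding the token dimension, at the cost of some parallelism; an alternative is to pad token indices and masks to a common length and mask out the padding when averaging.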

Let us know if you have any other questions.

Best,
Zirui

Thanks for your reply!