Why can the batch size only be 1?
aiiph4 opened this issue · 2 comments
aiiph4 commented
Hi, congrats on your great work!
I wonder why the batch size can only be set to 1. It seems to me that the controller could store attention maps for a larger batch. Is this due to memory cost or some other reason?
zwcolin commented
Hi,
Thanks for reaching out! Different examples have different cross-attention map indices (e.g., embeddings from noun tokens) used to compute the grounding objectives, so batched training requires some extra engineering effort. We simply did not implement it for this research project, but batched training is feasible engineering-wise.
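For illustration, one common way to batch variable-length index lists is to pad them to a common length and carry a validity mask, so padded entries are ignored when the grounding loss is averaged. The sketch below is hypothetical (the function names and the toy loss are not from the repo), just to show the padding/masking idea:

```python
def pad_indices(index_lists, pad_value=0):
    """Pad per-example token-index lists to a common length and
    return a parallel validity mask (1 = real index, 0 = padding)."""
    max_len = max(len(ix) for ix in index_lists)
    padded, mask = [], []
    for ix in index_lists:
        n_pad = max_len - len(ix)
        padded.append(list(ix) + [pad_value] * n_pad)
        mask.append([1] * len(ix) + [0] * n_pad)
    return padded, mask

def masked_grounding_loss(attn_scores, padded, mask):
    """Toy masked mean of the attention scores at the selected
    noun-token indices, one value per example in the batch."""
    losses = []
    for scores, ix, m in zip(attn_scores, padded, mask):
        selected = [scores[i] * keep for i, keep in zip(ix, m)]
        losses.append(sum(selected) / max(sum(m), 1))
    return losses

# Example: two prompts with 2 and 1 noun tokens respectively.
padded, mask = pad_indices([[2, 5], [3]])
# padded == [[2, 5], [3, 0]], mask == [[1, 1], [1, 0]]
```

In a real implementation the same pattern would apply to attention-map tensors (pad the index dimension, multiply by the mask, divide by the per-example count), which is what makes batched training an engineering task rather than a conceptual one.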
Let us know if you have any other questions.
Best,
Zirui
aiiph4 commented
Thanks for your reply!