davidnvq/grit

How to generate more captions

victorup opened this issue · 4 comments

Hello,

I would like to generate multiple candidate captions for one image when doing image captioning. How could I do this? Is there any parameter I can set?

Thanks!

Sorry for the late reply. Indeed, only one caption is generated by default.
To generate multiple captions per image, you will need to put some work into the beam search decoding.
Please try implementing it yourself first.
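
As a starting point, here is a minimal, framework-free sketch of a beam search that returns several candidates instead of only the best one. It is not the code from this repository: `step_log_probs` is a hypothetical callback standing in for one decoder step (it takes the tokens so far and returns one log-probability per vocabulary entry), and `toy_step`, the token ids, and the 4-token vocabulary are made up for illustration. No length normalization is applied, so shorter hypotheses are slightly favored.

```python
import math
from typing import Callable, List, Sequence, Tuple


def beam_search(
    step_log_probs: Callable[[Sequence[int]], Sequence[float]],
    bos_id: int,
    eos_id: int,
    beam_size: int = 3,
    max_len: int = 20,
) -> List[Tuple[List[int], float]]:
    """Return up to `beam_size` (token_ids, log_prob) candidates, best first."""
    beams = [([bos_id], 0.0)]  # unfinished hypotheses
    finished = []              # hypotheses that already ended with eos_id
    for _ in range(max_len):
        # Expand every live hypothesis by every vocabulary entry.
        candidates = []
        for tokens, score in beams:
            for token_id, lp in enumerate(step_log_probs(tokens)):
                candidates.append((tokens + [token_id], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # Keep the best `beam_size` expansions; set aside the finished ones.
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos_id:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_size]


def toy_step(tokens: Sequence[int]) -> List[float]:
    """Hypothetical decoder step over the vocabulary [bos, eos, a, b]."""
    probs = [0.0, 0.1, 0.6, 0.3] if len(tokens) < 3 else [0.0, 0.8, 0.1, 0.1]
    return [math.log(p) if p > 0 else float("-inf") for p in probs]


results = beam_search(toy_step, bos_id=0, eos_id=1, beam_size=3, max_len=5)
for tokens, score in results:
    print(tokens, round(score, 3))
```

With a real model, the callback would run the caption decoder on the image features plus the partial caption, and each returned token list would be detokenized into one candidate caption.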


Hi @davidnvq, I want to increase the length of the generated captions. Do I need to train the model again, or do you have any other ideas? Thanks.

@taruntiwarihp Sorry for missing your question. I suppose you could suppress logits during generation. For example, whenever the model outputs the end-of-sequence token while the generated caption is still short, you can select the token with the second-highest logit instead.
However, this can't guarantee better captions, since the model is expected to generate captions whose distribution is similar to that of the training captions. Therefore, I believe that to get long, high-quality captions, the model should be trained or fine-tuned again on a training dataset containing such captions.
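
The logit-suppression idea could look roughly like the sketch below. It assumes that the token being suppressed is the end-of-sequence token and that decoding is greedy; `step_logits`, `toy_logits`, and the token ids are hypothetical placeholders rather than this repository's API. Setting the end-of-sequence logit to negative infinity while the caption is shorter than `min_len` means the next-best token is picked instead.

```python
from typing import Callable, List, Sequence


def greedy_decode_min_length(
    step_logits: Callable[[Sequence[int]], Sequence[float]],
    bos_id: int,
    eos_id: int,
    min_len: int = 5,
    max_len: int = 20,
) -> List[int]:
    """Greedy decoding that forbids eos_id until `min_len` tokens are generated."""
    tokens = [bos_id]
    while len(tokens) - 1 < max_len:
        logits = list(step_logits(tokens))
        if len(tokens) - 1 < min_len:
            # Caption is still too short: make the end token unselectable.
            logits[eos_id] = float("-inf")
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_logits(tokens: Sequence[int]) -> List[float]:
    """Hypothetical decoder step over [bos, eos, a, b] that always prefers eos."""
    return [-5.0, 2.0, 1.0, 0.5]


short = greedy_decode_min_length(toy_logits, bos_id=0, eos_id=1, min_len=0)
longer = greedy_decode_min_length(toy_logits, bos_id=0, eos_id=1, min_len=4)
print(short)   # [0, 1]
print(longer)  # [0, 2, 2, 2, 2, 1]
```

The toy output also shows the caveat above: forcing extra tokens only makes the caption longer (here it just repeats the same token), it does not make it better.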

@victorup
Hello, have you managed to implement generating multiple captions?