Does the training of EVA involve masking text (caption) tokens?

Question

Closed this issue a year ago · 1 comment

I am new to this area and just want to check: does training the EVA model involve masking text (caption) tokens, or does it only involve masking image patches?
Thank you so much for your help.

Answer 1 · 2024-01-14T15:44:02.000Z

@leyangjin Only image patches are masked.
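To make the distinction concrete, here is a minimal sketch of what "masking image patches only" can look like. This is not EVA's actual code: the function name `mask_image_patches`, the uniform random sampling, the 14x14 patch grid, and the 40% mask ratio are all assumptions chosen for illustration.

```python
import random


def mask_image_patches(num_patches, mask_ratio, seed=None):
    """Return a list of booleans, True where an image patch is masked.

    Hypothetical helper: picks patches uniformly at random, which may
    differ from the masking strategy the real training code uses.
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


# Assumed sizes: a 14x14 grid of patches and a 40% mask ratio.
patch_mask = mask_image_patches(num_patches=196, mask_ratio=0.4, seed=0)

# Caption tokens are left untouched: no position is ever masked.
caption_tokens = ["a", "photo", "of", "a", "cat"]
text_mask = [False] * len(caption_tokens)

print(sum(patch_mask), "of", len(patch_mask), "image patches masked")
print(sum(text_mask), "of", len(text_mask), "caption tokens masked")
```

In this sketch the mask exists only on the image side; the caption, if one is present at all, passes through unchanged.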