There seems to be a bug at datasets/bases.py#L157.
zhixhan commented
At datasets/bases.py#L157, you pass caption_tokens directly to the function _build_random_masked_tokens_and_labels, and caption_tokens is modified in place inside this function. As a result, the masked captions are also used in the SDM task and the ID task, which is inconsistent with the description in the paper.
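To illustrate the problem, here is a minimal sketch, not the repository's actual code. The function `build_random_masked_tokens_and_labels` below is a simplified, hypothetical stand-in that uses plain Python lists and an assumed `MASK_ID`; it only reproduces the in-place mutation pattern. If the real code works on tensors, passing a copy (for example via `.clone()`) would probably be the equivalent fix.

```python
import random

MASK_ID = 0  # hypothetical mask token id, for illustration only


def build_random_masked_tokens_and_labels(tokens, mask_prob=0.15, rng=None):
    """Simplified stand-in: masks `tokens` IN PLACE and returns (tokens, labels)."""
    rng = rng or random.Random(0)
    labels = []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels.append(tok)   # remember the original token as the MLM target
            tokens[i] = MASK_ID  # this write mutates the caller's list
        else:
            labels.append(-1)    # -1 marks "not a prediction target" in this sketch
    return tokens, labels


original = [11, 12, 13, 14, 15, 16, 17, 18]

# Buggy pattern: the caller's tokens are passed directly and get overwritten.
buggy_input = list(original)
build_random_masked_tokens_and_labels(buggy_input, mask_prob=1.0)

# Possible fix: pass a copy, so the unmasked tokens stay available for other tasks.
safe_input = list(original)
mlm_ids, mlm_labels = build_random_masked_tokens_and_labels(
    list(safe_input), mask_prob=1.0
)
```

With `mask_prob=1.0` every position is masked, so `buggy_input` ends up fully masked while `safe_input` keeps the original caption tokens.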