microsoft/X-Decoder

Text Encoder src code

trqminh opened this issue · 1 comment

Hi,
Thank you for your consideration.
Could you show me where I could find this part in the code?
[image]

Thank you.

For grounding:

output = output[:-len(_grounding_tokens)]

For captioning:

output = torch.cat((output, _caping_lang_embed), dim=0) # concatenate object queries, class token, and caption tokens along the sequence dimension.
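
To make the shapes concrete, here is a minimal standalone sketch of the two operations above. It is not taken from the repository: the tensor sizes, the sequence-first layout `(tokens, batch, dim)`, and the names `queries`, `grounding_tokens`, and `caption_embed` are assumptions for illustration, and it assumes the grounding tokens were appended to the end of `output` before being sliced off.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
num_queries, num_grounding, num_caption, batch, dim = 101, 4, 6, 2, 8

# Stand-ins for the real tensors (assumed layout: tokens, batch, dim).
queries = torch.randn(num_queries, batch, dim)            # object queries + class token
grounding_tokens = torch.randn(num_grounding, batch, dim)  # stand-in for _grounding_tokens
caption_embed = torch.randn(num_caption, batch, dim)       # stand-in for _caping_lang_embed

# Grounding (assumed): tokens are appended along dim 0, then removed again,
# so only the query/class-token rows remain.
output = torch.cat((queries, grounding_tokens), dim=0)
grounding_output = output[:-len(grounding_tokens)]

# Captioning: caption token embeddings are appended along dim 0 and kept.
caption_output = torch.cat((queries, caption_embed), dim=0)

print(grounding_output.shape)  # sequence length equals num_queries
print(caption_output.shape)    # sequence length equals num_queries + num_caption
```

Note that `len(tensor)` returns the size of the first dimension, so the slice removes exactly the rows contributed by the grounding tokens.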