Set of referring image segmentation queries
Thanks for your interesting work!!
I couldn't find the construction details of the initial text queries for referring image segmentation.
From my understanding, open-vocabulary segmentation takes a set of input text queries, which is what makes your recurrent filtering of non-existing concept texts necessary. However, since referring image segmentation takes a single image–text pair as input, I don't understand how CaR recurrently eliminates irrelevant texts. Knowing what the initial text queries are for this task would fill this gap in my understanding.
If this detail is already in the paper, I apologize for missing it.
Best regards,
Namyup Kim.
Hi Namyup,
Thank you for your kind interest! For referring segmentation we do not filter out irrelevant texts, so all results can be obtained in a single pass. We have released all the code at:
https://github.com/google-research/google-research/tree/master/clip_as_rnn
Please check it out!
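
For concreteness, here is a minimal sketch of the difference, assuming a hypothetical `car_segment(image, text_queries)` helper that stands in for one CLIP-guided proposal-and-scoring pass. The function names, threshold, and loop structure are illustrative only; please refer to the released `clip_as_rnn` code for the actual API.

```python
def car_segment(image, text_queries):
    """Placeholder for one CaR pass: returns a relevance score per text query.
    Stubbed here for illustration; use the released clip_as_rnn code for the real step."""
    raise NotImplementedError("See the clip_as_rnn repository for the actual implementation.")


def open_vocab_segmentation(image, candidate_classes, threshold=0.5, max_iters=10):
    """Open-vocabulary setting: the input is a set of candidate class names.
    Queries whose masks score below the threshold are dropped, and the pass is
    re-run on the surviving queries until the set stabilizes (the recurrent filtering)."""
    queries = list(candidate_classes)
    for _ in range(max_iters):
        scores = car_segment(image, queries)             # one forward pass
        kept = [q for q in queries if scores[q] >= threshold]
        if kept == queries:                              # converged: nothing removed
            break
        queries = kept
    return car_segment(image, queries)                   # final masks for surviving queries


def referring_segmentation(image, referring_expression):
    """Referring setting: the single referring expression is the only text query,
    so no filtering loop is needed and one pass suffices."""
    return car_segment(image, [referring_expression])
```

In other words, the "initial text queries" for referring segmentation are just the referring expression itself, and the recurrent filtering loop used for open-vocabulary segmentation is simply skipped.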