microsoft/X-Decoder

Reopen Issue to Reclarify (unknown things/stuffs) : https://github.com/microsoft/X-Decoder/issues/12

ffahmed opened this issue · 2 comments

About this issue (#12 (comment))

What I meant is: if we don't know in advance which things and stuff are in a picture, I want to tell the model to give me all possible things and stuff in panoptic segmentation. How do I do that?

Please go back to our paper and other VL detection papers to understand the problem in more detail. Basically, we do not have a class pool; with the support of the language encoder, the model can handle any class you want. Thanks!

jwyang commented

@ffahmed What you are asking for is closer to a dense or regional captioning task. However, you can enumerate as many concepts as possible in the pool to mimic this behavior.
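
For anyone landing here later, a minimal sketch of what "enumerate as many concepts as possible" could look like in practice. It only builds the text pool; it does not call X-Decoder. The helper `build_concept_pool` and the three tiny vocabularies are hypothetical names made up for illustration, and how the resulting list is handed to the model depends on the repo's demo code, which is not shown here.

```python
def build_concept_pool(*vocabularies):
    """Merge several class-name lists into one pool.

    Names are lower-cased and stripped, duplicates are dropped,
    and first-seen order is preserved.
    """
    seen = set()
    pool = []
    for vocab in vocabularies:
        for name in vocab:
            key = name.strip().lower()
            if key and key not in seen:
                seen.add(key)
                pool.append(key)
    return pool


# Hypothetical, deliberately tiny vocabularies. In practice these would be
# much larger lists (for example, class names from several datasets).
thing_classes = ["person", "car", "dog", "Bicycle"]
stuff_classes = ["sky", "road", "grass", "water"]
extra_concepts = ["dog", "traffic light", "building", " sky "]

concept_pool = build_concept_pool(thing_classes, stuff_classes, extra_concepts)
print(len(concept_pool), concept_pool)
```

The larger the pool, the closer this gets to "give me everything", but it is still an open-vocabulary query over a list you supply, not true dense captioning.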