hkchengrex/STCN

Multi-view object segmentation

Closed this issue · 2 comments

Hi,
first of all, congratulations on your excellent work.
I wanted to know whether this solution can also be used when, instead of a video, there are several images of the same object taken from multiple angles (not necessarily in order).

Is there any theoretical basis that motivates this, or have any experiments been done?

Thanks
Gianluca

That would be interesting, but we haven't done any experiments on that. Technically it can "kind of" work because we are not explicitly using temporal smoothness. It might help to use some heuristic to sort the images so that the change in appearance between consecutive frames is minimized, and then feed the frames as a video. Note that the memory frames are an unordered set. So if, say, you have a bunch of images taken at different azimuths (represented by numbers), you can arrange them into different types of sequences:
5->6->7->8->4->3->2->1
5->4->3->2->1->6->7->8
1->2->3->4->5->6->7->8
5->6->7->8 AND 5->4->3->2->1 (as two separate sequences)
with the mask propagated from the first frame of each sequence. They should all kind of work -- you can try for yourself on your data.

In contrast, something like 1->4->8->... would be bad, because the appearance changes too much from one frame to the next.
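
To make the orderings above concrete, here is a minimal sketch (not part of STCN) that builds them from azimuth values. The function names `outward_sequences` and `max_gap_to_seen` are hypothetical. The gap check assumes, as a simplification, that every earlier frame is available in memory, and it treats azimuths as plain numbers without wrap-around.

```python
def outward_sequences(azimuths, start):
    """Split views into two sequences that both begin at `start`.

    `azimuths` maps a view id to its azimuth; `start` is the view id
    that carries the initial mask. (Hypothetical helper, not STCN code.)
    """
    order = sorted(azimuths, key=azimuths.get)  # view ids sorted by azimuth
    i = order.index(start)
    forward = order[i:]        # start, then increasing azimuth
    backward = order[i::-1]    # start, then decreasing azimuth
    return forward, backward


def max_gap_to_seen(seq, azimuths):
    """Largest azimuth distance from a frame to its closest earlier frame.

    Assumes every earlier frame is in memory, which is a simplification.
    """
    gaps = []
    for k in range(1, len(seq)):
        current = azimuths[seq[k]]
        gaps.append(min(abs(current - azimuths[p]) for p in seq[:k]))
    return max(gaps)


if __name__ == "__main__":
    az = {i: float(i) for i in range(1, 9)}    # views 1..8, as in the example
    forward, backward = outward_sequences(az, start=5)
    single = forward + backward[1:]            # one sequence covering all views
    print(forward, backward, single)
    print(max_gap_to_seen(single, az), max_gap_to_seen([1, 4, 8], az))
```

Under this measure, the orderings listed above always have a nearby earlier frame, while 1->4->8 does not.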
Depending on the density of images that you have, you might also want to decrease the mem_every parameter so that frames are added to the memory more often.
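
If no azimuth information is available, one possible sorting heuristic is a greedy nearest-neighbour ordering on some per-image appearance descriptor. This is only a sketch under assumptions: `greedy_order` is a hypothetical helper, and the descriptor (for example a colour histogram or an embedding) is whatever you choose to compute; neither comes from STCN.

```python
import math


def greedy_order(features, start=0):
    """Order images so each one is close in appearance to the previous one.

    `features` is a list of equal-length numeric descriptors, one per image.
    `start` is the index of the image that has the initial mask.
    Ties are broken by the lower image index.
    """
    remaining = set(range(len(features)))
    remaining.remove(start)
    order = [start]
    while remaining:
        last = features[order[-1]]
        # Pick the unvisited image whose descriptor is closest to the last one.
        nxt = min(remaining, key=lambda i: (math.dist(last, features[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    # Toy 1-D descriptors, purely illustrative.
    print(greedy_order([[0.0], [1.0], [2.5], [4.0]], start=1))
```

The resulting index order can then be used to name or copy the images as consecutive frames of a pseudo-video.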

Thank you very much! I'll do some tests!