Curiosity of ECLIPSE's network architecture
Closed this issue · 4 comments
Hi! May I ask why you adopted the Mask2Former architecture? Is it because of some disadvantage of Mask2Former in continual learning
Hi.
There are several reasons why we adopted Mask2Former.
- Mask2Former is one of the representative and widely used segmentation models recently.
- To apply the visual prompt tuning to continual panoptic segmentation, Mask2former is the most suitable transformer-based architecture.
- Mask2Former is powerful and supports universal image segmentation (including panoptic, semantic, and instance segment tasks).
My apologies for my poor English. I did not mean to ask this question. The question I really want to ask is:
Why does ECLIPSE change the Mask2Former architecture? In the original Mask2Former design, there are some connections between the pixel decoder and the transformer. However, in ECLIPSE, these connections from the pixel decoder are removed and changed to image embeddings from the backbone output. May I ask why you changed the Mask2Former architecture? Is it because of some disadvantage of Mask2Former in continual learning?
The above information is taken from pictures of the Mask2Former and ECLIPSE design. If what I have written above is incorrect, please correct me.