Detach for text
vateye opened this issue · 3 comments
Hi, I am confused about the loss computation. When computing the loss for the learnable queries, the text features appear to be detached, so no gradient flows back through them:
X-Decoder/xdecoder/body/decoder/xdecoder.py
Line 230 in fca01f6
Hi,
Text features are not detached in every setting: they are detached on the per-layer output but stay attached on query_emb. This is an empirical design choice.
So, during training on tasks that use learnable queries (e.g., segmentation, grounding), are the text features always detached?
Nope, please go back to the code:
X-Decoder/xdecoder/body/decoder/xdecoder.py
Line 233 in fca01f6
Query embedding is attached.
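
For anyone reading this later, here is a minimal PyTorch sketch of the difference between a detached and an attached path. It does not reproduce the X-Decoder code: `text_proj`, `tokens`, `queries`, and `text_encoder_gets_grad` are hypothetical stand-ins, and the loss is a toy similarity sum. It only shows that `.detach()` stops gradients from reaching the module that produced the text features, while the learnable queries still receive gradients either way.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins, not names or shapes from X-Decoder.
text_proj = torch.nn.Linear(4, 4)                  # plays the role of the text branch
tokens = torch.randn(3, 4)                         # 3 "text tokens"
queries = torch.nn.Parameter(torch.randn(2, 4))    # 2 learnable queries


def text_encoder_gets_grad(detach_text: bool) -> bool:
    """Return True if the toy loss sends a gradient into text_proj."""
    # Clear old gradients so each call is measured independently.
    for p in text_proj.parameters():
        p.grad = None
    queries.grad = None

    text_feat = text_proj(tokens)
    if detach_text:
        # Cut the graph here: text_feat becomes a constant for autograd.
        text_feat = text_feat.detach()

    # Toy loss: sum of query-to-text similarities.
    loss = (queries @ text_feat.T).sum()
    loss.backward()

    # The queries are trained in both cases.
    assert queries.grad is not None
    return text_proj.weight.grad is not None


print("detached path reaches text branch:", text_encoder_gets_grad(True))
print("attached path reaches text branch:", text_encoder_gets_grad(False))
```

If a model uses the same text features on both a detached and an attached path, the text branch still gets trained through the attached one, which matches the answer above.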