DETR VAE uses output of first layer of transformer decoder?
Opened this issue · 3 comments
KamilDre commented
CarlDegio commented
Agreed. I removed the subsequent decoder layers and it had no effect on training.
uuu686 commented
@CarlDegio Hello, I changed the 0 to -1 so that the output of the last decoder layer is used instead, but there was no obvious difference in performance. May I ask whether your experiment showed any effect?
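For reference, here is a minimal sketch of the indexing in question (assuming a DETR-style decoder run with `return_intermediate=True`, which stacks every layer's output into `[num_layers, batch, num_queries, hidden_dim]`; the layer counts, shapes, and names below are illustrative, not the exact repo code):

```python
import torch
import torch.nn as nn

# Illustrative dimensions; the real model's values may differ.
num_layers, batch, num_queries, hidden_dim = 6, 2, 100, 256

layers = nn.ModuleList(
    nn.TransformerDecoderLayer(d_model=hidden_dim, nhead=8, batch_first=True)
    for _ in range(num_layers)
)

tgt = torch.zeros(batch, num_queries, hidden_dim)  # decoder queries
memory = torch.randn(batch, 50, hidden_dim)        # encoder output

# Mimic return_intermediate=True: collect every layer's output.
intermediate = []
out = tgt
for layer in layers:
    out = layer(out, memory)
    intermediate.append(out)
hs = torch.stack(intermediate)  # [num_layers, batch, num_queries, hidden_dim]

hs_first = hs[0]   # output of the FIRST decoder layer (what the title asks about)
hs_last = hs[-1]   # output of the LAST decoder layer (the 0 -> -1 change I tried)
print(hs_first.shape, hs_last.shape)
```

If only `hs[0]` is ever consumed downstream, the later decoder layers receive no gradient through that path, which would be consistent with removing them having no effect on training.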