tonyzhaozh/act

DETR VAE uses output of first layer of transformer decoder?


In

hs = self.transformer(src, None, self.query_embed.weight, pos, latent_input, proprio_input, self.additional_pos_embed.weight)[0]
and
hs = self.transformer(transformer_input, None, self.query_embed.weight, self.pos.weight)[0]
you index the output of the transformer with [0]. Doesn't this select the output of the first layer of the transformer decoder rather than the last layer? Is this behaviour expected?
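
For concreteness, a minimal sketch of the indexing in question, assuming the transformer returns the stacked per-layer decoder outputs (as with return_intermediate_dec=True in DETR-style code) with shape [num_decoder_layers, batch, num_queries, hidden_dim]. The tensor and dimensions below are illustrative, not taken from the repo:

import torch

# Illustrative dimensions only (not the repo's actual values).
num_dec_layers, batch, num_queries, hidden_dim = 7, 2, 100, 512

# Stand-in for the stacked intermediate decoder outputs a DETR-style
# transformer returns: [num_decoder_layers, batch, num_queries, hidden_dim].
hs = torch.randn(num_dec_layers, batch, num_queries, hidden_dim)

first_layer_out = hs[0]   # what indexing with [0] selects
last_layer_out = hs[-1]   # the final decoder layer's output

print(first_layer_out.shape)  # torch.Size([2, 100, 512])
print(last_layer_out.shape)   # torch.Size([2, 100, 512])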

CarlDegio commented

Agreed. I removed the subsequent decoder layers and it had no effect on training.

uuu686 commented

@CarlDegio Hello, I changed the index from 0 to -1 so that the output of the last layer is used, but there was no obvious difference in the results. May I ask whether your experiment showed a clear effect?

I did not test the effect of a multi-layer decoder. I only removed the decoder forward passes that were unused in the original code, to speed up training. @uuu686
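
One way to get that kind of speed-up is to build the decoder with a single layer rather than computing several layers and then discarding all but one. The sketch below uses PyTorch's generic nn.TransformerDecoder as a stand-in; it is not the repo's code, only an illustration of the idea:

import torch
from torch import nn

hidden_dim, nheads = 512, 8

# A single decoder layer; extra layers would only add computation whose
# outputs are never consumed if downstream code reads one layer's output.
decoder_layer = nn.TransformerDecoderLayer(d_model=hidden_dim, nhead=nheads)
single_layer_decoder = nn.TransformerDecoder(decoder_layer, num_layers=1)

tgt = torch.randn(100, 2, hidden_dim)     # (num_queries, batch, hidden_dim)
memory = torch.randn(300, 2, hidden_dim)  # (encoder tokens, batch, hidden_dim)

out = single_layer_decoder(tgt, memory)
print(out.shape)  # torch.Size([100, 2, 512])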