Implementation details of view transformer
Zhentao-Liu opened this issue · 3 comments
Zhentao-Liu commented
In the provided code, `attn = k - q[:,:,None,:] + pos` is followed by `attn = self.attn_fc(attn)`. However, in Fig. 2(a) and Alg. 1 there is no `self.attn_fc` component. Could you give an explanation?
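For concreteness, here is a minimal runnable sketch of the computation the quoted snippet describes. Only `attn_fc`, `q`, `k`, and `pos` come from the issue; the tensor shapes, the choice of a linear layer for `attn_fc`, and the softmax axis are my assumptions, not the repository's actual definitions.

```python
import torch
import torch.nn as nn

class SubtractiveAttentionSketch(nn.Module):
    """Minimal sketch of the quoted snippet. Only `attn_fc`, `q`, `k`, and
    `pos` come from the issue; shapes and the layer type are assumptions."""

    def __init__(self, dim: int):
        super().__init__()
        # Assumption: attn_fc is a small learned map on the score vectors.
        self.attn_fc = nn.Linear(dim, dim)

    def forward(self, q, k, pos):
        # Assumed shapes: q is (B, N, D); k and pos are (B, N, V, D),
        # where V indexes the source views.
        attn = k - q[:, :, None, :] + pos  # subtractive scores plus positional term
        attn = self.attn_fc(attn)          # the extra learned map asked about above
        return attn.softmax(dim=-2)        # assumption: normalize over the view axis
```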
Zhentao-Liu commented
This part of the code is in transformer_network.py, class Attention2D.
Zhentao-Liu commented
In Eq. 9, what do you mean by applying diag(.)?
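(As general background, not specific to this paper: in standard linear-algebra notation, diag(.) maps a vector to a diagonal matrix, as below; how Eq. 9 uses it is for the authors to confirm.)

```latex
% Standard meaning of diag(.), given only as general notation,
% not as a claim about Eq. 9 of the paper:
\operatorname{diag}(v) =
\begin{pmatrix}
  v_1 &        &     \\
      & \ddots &     \\
      &        & v_n
\end{pmatrix},
\qquad v \in \mathbb{R}^n .
```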
MukundVarmaT commented
Hi @Zhentao-Liu!
Thank you for pointing it out! Yes, there is an error in our pseudo-code in Algorithm 1 (although f_a(.) was defined, we never used it). However, our implementation details in the text do discuss this (Appendix B, "Memory-efficient Cross-View Attention").
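For anyone hitting the same mismatch: under my reading of the reply (not an official correction), f_a(.) in Algorithm 1 corresponds to `attn_fc` in the code and should be applied to the subtractive scores before the softmax. A toy usage of the sketch above, with made-up sizes:

```python
import torch

# Toy usage of SubtractiveAttentionSketch above, with made-up sizes
# (B rays, N samples, V source views, D feature dims).
B, N, V, D = 2, 16, 8, 64
layer = SubtractiveAttentionSketch(dim=D)
q = torch.randn(B, N, D)
k = torch.randn(B, N, V, D)
pos = torch.randn(B, N, V, D)
weights = layer(q, k, pos)  # f_a(.) (attn_fc) is applied inside, before the softmax
print(weights.shape)        # torch.Size([2, 16, 8, 64])
```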