lucidrains/magvit2-pytorch

Question about causal 3D CNN

eii-lyl opened this issue · 1 comment

Hi. Thank you for your excellent work. I have a question about the causal 3D CNN.
As I understand it, if we use a causal 3D CNN, we don't need a transformer at all — the transformer is only used in C-ViViT. But in the code I saw `linear_attend_space` and `attend_space`.
Is my understanding wrong?

@eii-lyl yeah, just don't use the attention layers to align with the paper

i simply have those blocks in there because i believe in attention
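For anyone wondering what "don't use the attention layers" looks like in practice: the tokenizer is configured with a `layers` tuple of block-name strings, and the attention blocks can simply be left out of that tuple. A minimal sketch below — the layer names and the helper `strip_attention` are illustrative assumptions in the style of the repo's README, not the repo's own API:

```python
# Hypothetical layer spec in the style of magvit2-pytorch's VideoTokenizer config.
# To match the paper (pure causal 3D CNN), drop the attention-type block names.
full_layers = (
    'residual',
    'compress_space',
    ('consecutive_residual', 2),
    'linear_attend_space',   # attention block — optional extra, not in the paper
    'compress_space',
    ('consecutive_residual', 2),
    'attend_space',          # attention block — optional extra, not in the paper
)

# Assumed set of attention-style block names (illustrative).
ATTENTION_BLOCKS = {
    'attend_space', 'linear_attend_space',
    'attend_time', 'linear_attend_time',
}

def strip_attention(layers):
    """Remove attention-type blocks, leaving a causal-CNN-only stack."""
    return tuple(
        layer for layer in layers
        # entries are either a name string or a (name, repeat) tuple
        if (layer if isinstance(layer, str) else layer[0]) not in ATTENTION_BLOCKS
    )

cnn_only_layers = strip_attention(full_layers)
print(cnn_only_layers)
```

The same filtered tuple would then be passed as the `layers` argument when constructing the tokenizer, giving the paper-faithful variant while keeping the attention-augmented config around for comparison.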