About GroupNorm described in the MAGVIT V2 paper

Question

sen-ye opened this issue 7 months ago · 2 comments

Hello, thanks for your nice work. I noticed some differences between your implementation and the original paper. One notable difference is that the original paper uses group normalization. From my understanding, applying group normalization directly to a 5D video tensor (B, C, T, H, W) can result in non-causal behavior, because the normalization statistics are computed across the time axis, so the output for an earlier frame depends on later frames. Your implementation does not include group normalization. Could you please explain the reasoning behind this choice? Is it related to the issue I mentioned?
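
To make the concern concrete, here is a minimal NumPy sketch. It is not code from this repository or from the paper: `group_norm` is a plain re-implementation of group normalization without affine parameters, and `per_frame_group_norm` is a hypothetical variant (my own assumption, for comparison only) that computes statistics separately for each frame. The sketch perturbs only the last frame and checks whether the normalized first frame changes.

```python
import numpy as np


def group_norm(x, num_groups, eps=1e-5):
    """Plain group norm (no affine) on a (B, C, T, H, W) array.

    Mean and variance are pooled over the channels of each group
    together with T, H and W, so every frame shares the same statistics.
    """
    b = x.shape[0]
    g = x.reshape(b, num_groups, -1)
    mean = g.mean(axis=2, keepdims=True)
    var = g.var(axis=2, keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(x.shape)


def per_frame_group_norm(x, num_groups, eps=1e-5):
    """Hypothetical variant: statistics are computed per frame.

    Pools over the channels of each group and over H and W only,
    so frame t never sees any other frame.
    """
    b, c, t, h, w = x.shape
    g = x.reshape(b, num_groups, c // num_groups, t, h, w)
    mean = g.mean(axis=(2, 4, 5), keepdims=True)
    var = g.var(axis=(2, 4, 5), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(x.shape)


rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4, 3, 2, 2))  # (B, C, T, H, W)

# Change only the last ("future") frame.
x_future = x.copy()
x_future[:, :, -1] += 10.0

# With pooled statistics, the first frame's output depends on the last frame.
first_frame_changed = not np.allclose(
    group_norm(x, 2)[:, :, 0], group_norm(x_future, 2)[:, :, 0]
)

# With per-frame statistics, the first frame's output is unaffected.
per_frame_unchanged = np.allclose(
    per_frame_group_norm(x, 2)[:, :, 0],
    per_frame_group_norm(x_future, 2)[:, :, 0],
)

print(first_frame_changed, per_frame_unchanged)
```

If my understanding is right, the first check shows the leak from future frames, while the per-frame variant avoids it at the cost of not sharing statistics over time.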