Why is magvitv2 different from the description in the paper? Am I understanding it wrong?
Why is magvitv2 different from the description in the paper? Am I understanding it wrong?
Yes. I implemented it as closely to the paper's description as possible. The results look normal.
Hello, I'm interested in how you implemented the group norm described in the MAGVIT-V2 paper. Did you apply group norm directly to a video tensor?
Please refer directly to Tencent’s implementation: https://github.com/TencentARC/Open-MAGVIT2
Thanks~
Please refer directly to Tencent’s implementation: https://github.com/TencentARC/Open-MAGVIT2
This implementation still seems to differ substantially from the original paper, and it only uses a small model to train an image tokenizer. Since you said you implemented it, what are the evaluation results, e.g. ImageNet/UCF101 reconstruction? Is there any chance to get in touch with you? @hefeicyp
Hey, we've implemented a version that's almost perfectly aligned with the original paper. You can check it out for more details: https://github.com/cofe-ai/O2-MAGVIT2