thuanz123/enhancing-transformers

Pretrained Stage 2 Transformer for ViT-VQGAN

manuelknott opened this issue · 8 comments

Hi,

Thank you for the great work and especially for publishing the pre-trained models.

I was wondering: do you also plan to publish the weights of the autoregressive transformer (stage 2 training) of ViT-VQGAN?

hi @manuelknott, the code for the stage 2 transformer is currently buggy, so once I've fixed everything, I will try to train and release a pretrained model. But this will take a while, since I'm still learning about autoregressive modeling with transformers.
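
For context, the stage 2 objective boils down to next-token prediction over the discrete code indices produced by the frozen stage 1 ViT-VQGAN. A minimal sketch of that objective (illustrative names only, not this repo's code) could look like:

```python
# Minimal sketch of a stage 2 prior: a causal transformer trained to predict
# the next ViT-VQGAN code index. All names here are illustrative assumptions.
import torch
import torch.nn as nn

class TinyCodePrior(nn.Module):
    def __init__(self, vocab_size=8192, d_model=256, n_layers=4, n_heads=4, max_len=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, codes):                      # codes: (B, T) integer indices
        T = codes.size(1)
        x = self.tok_emb(codes) + self.pos_emb[:, :T]
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(codes.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)                        # (B, T, vocab_size) logits

# Teacher-forced next-token loss on code sequences from a frozen stage 1 tokenizer.
model = TinyCodePrior()
codes = torch.randint(0, 8192, (2, 256))           # placeholder for quantizer indices
logits = model(codes[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, 8192), codes[:, 1:].reshape(-1))
```

The stage 1 tokenizer stays frozen during this phase; only the prior is optimized.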

Awesome! Already looking forward to it.

Then, if you don't have any further questions, I will close this issue. Feel free to reopen it.

Is there any update on this? I have seen that there have been some modifications to the stage 2 code. If the bugs have been fixed, I can also try to train it on my own. Thank you!

hi @manuelknott, sorry for the late reply. The issue should be fixed now, and you can train a 2nd stage transformer.

Awesome, thank you. Just to be sure: the main.py file only covers stage 1 training for now, right? Is there any chance you could share the code for training stage 2 if it is not part of the repo yet?

hi @manuelknott, main.py supports training the stage 2 model; you just need to use the correct config. For example, you can refer to imagenet_gpt_vitvq_base.yaml to get a glimpse of a stage 2 config. Note that this example stage 2 config is quite big and only fits on at least 8 A100s, so you might want to reduce the parameters.
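
To get a feel for how far to scale the config down, a rough parameter-count estimate can help (a back-of-the-envelope sketch, not the repo's exact architecture): each transformer block contributes roughly 12 * d_model^2 parameters, plus the embeddings.

```python
# Back-of-the-envelope GPT-style parameter count, to gauge how much to shrink
# the stage 2 config for smaller GPUs. Sizes below are hypothetical examples.
def approx_gpt_params(n_layers, d_model, vocab_size, seq_len):
    blocks = 12 * n_layers * d_model ** 2          # attention + MLP weights per block
    embeddings = (vocab_size + seq_len) * d_model  # token + positional embeddings
    return blocks + embeddings

print(approx_gpt_params(24, 1024, 8192, 1024))     # ~0.31B parameters ("base"-scale)
print(approx_gpt_params(12, 512, 8192, 1024))      # ~0.04B parameters (reduced)
```

Halving depth and width cuts the model by roughly 8x, which is usually the first knob to turn before touching batch size or sequence length.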

Thanks for the explanation! I will try to train a model with my limited resources (2x A5000). Let's see how it goes. Do you plan to publish a pretrained stage 2 model anytime soon?