thuanz123/enhancing-transformers

Results

snarb opened this issue · 5 comments

snarb commented

Hello. Thanks for your work.
What training results and metrics do you get on any dataset with this code?

thuanz123 commented

Hi @snarb, sorry for the late reply.

  1. For visualizing results, you can play around in this Colab link
  2. For metrics, I compare the LPIPS score (perceptual loss) and the L2 loss (log_gaussian loss), and I look at the visual quality of the images in the validation dataset between different runs (see the sketch below)
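Not from the repository itself, but a minimal sketch of how such a validation comparison could look, assuming the `lpips` pip package and a hypothetical `model` whose forward pass returns reconstructions in [-1, 1]:

```python
# Illustrative sketch (not this repo's code): comparing reconstructions on a
# validation set with LPIPS (perceptual) and L2 losses.
import torch
import lpips  # pip install lpips

lpips_fn = lpips.LPIPS(net='vgg').eval()  # perceptual distance network

@torch.no_grad()
def evaluate(model, val_loader, device='cuda'):
    lpips_fn.to(device)
    model.eval().to(device)
    lpips_sum, l2_sum, n = 0.0, 0.0, 0
    for images, _ in val_loader:
        images = images.to(device)                  # assumed to be in [-1, 1]
        recons = model(images)                      # hypothetical: returns reconstructions
        lpips_sum += lpips_fn(recons, images).sum().item()
        l2_sum += ((recons - images) ** 2).mean(dim=(1, 2, 3)).sum().item()
        n += images.size(0)
    return lpips_sum / n, l2_sum / n                # lower is better for both
```

Comparing these two averages between runs, together with a visual check of the reconstructions, is the kind of comparison described above.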
snarb commented

Thanks @thuanz123. I have found another VQ-GAN implementation: https://github.com/lucidrains/parti-pytorch/blob/main/parti_pytorch/vit_vqgan.py. Can you please tell me what the significant differences are? It looks like you have added ideas from RQ-VAE.

thuanz123 commented

  1. Lucidrains's implementation of ViT-VQGAN is quite complex, as he applies a lot of additional techniques, whereas my code uses only a plain, simple ViT architecture. Also, training ViT-VQGAN with his code is very slow compared to mine, and the image quality is not much better when I train for the same number of iterations. This may be biased, since I haven't carefully tried his code.
  2. Apart from the additional code for RQ-VAE (sketched after this list), my implementation is the closest one to the author's unreleased implementation, since I asked him a lot of questions.
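For context, the RQ-VAE addition mentioned in point 2 is residual quantization: the encoder output is quantized several times, with each round quantizing the residual left over by the previous round. The sketch below only illustrates that idea (it omits the straight-through gradient estimator and the commitment loss) and is not this repository's actual code; all names in it are made up for the example.

```python
# Illustrative sketch of residual quantization (the RQ-VAE idea), not this repo's code.
import torch
import torch.nn as nn

class ResidualQuantizer(nn.Module):
    def __init__(self, codebook_size=1024, dim=256, depth=4):
        super().__init__()
        self.depth = depth
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, z):                                    # z: (batch, tokens, dim)
        residual, quantized, indices = z, torch.zeros_like(z), []
        for _ in range(self.depth):
            # pick the nearest codebook entry for the current residual
            dists = torch.cdist(residual, self.codebook.weight.unsqueeze(0))  # (B, T, K)
            idx = dists.argmin(dim=-1)                       # (B, T)
            code = self.codebook(idx)                        # (B, T, dim)
            quantized = quantized + code
            residual = residual - code
            indices.append(idx)
        # the sum of the selected codes approximates z; indices stack to (B, T, depth)
        return quantized, torch.stack(indices, dim=-1)
```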

Also, a big difference from lucidrains' code is that my code supports multi-node, multi-GPU training thanks to PyTorch Lightning, while his code does not seem to support this.
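For reference, multi-node, multi-GPU training with PyTorch Lightning usually comes down to a few Trainer flags. The snippet below shows the standard Lightning arguments; the actual entry point and configuration used in this repository may differ.

```python
# Illustrative Lightning Trainer setup for multi-node, multi-GPU training
# (standard Lightning flags; not necessarily how this repo wires it up).
import pytorch_lightning as pl

trainer = pl.Trainer(
    accelerator="gpu",
    devices=4,        # GPUs per node
    num_nodes=2,      # machines participating in the run
    strategy="ddp",   # one DDP process per GPU across all nodes
)
# trainer.fit(model, train_dataloader, val_dataloader)
```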

I will close this for now. If you have any other questions, feel free to reopen.