FoundationVision/VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Python · MIT

Issues

Conflicting Instructions for computing FID
#86 opened 13 days ago by Kumbong
0
FID of VQVAE
#85 opened 18 days ago by RohollahHS
2
generate images with arbitrary resolutions
#84 opened a month ago by Leiii-Cao
0
How can I train VAR with my own data? Does the num_class parameter in VAR have any effect?
#82 opened a month ago by zhangqingwu
1
Cannot reproduce results in Table 1, especially the IS score
#83 opened a month ago by adreamwu
1
About in-painting tasks
#74 opened 3 months ago by zhiyuanyou
1
class token replaces first token
#81 opened a month ago by cyw-3d
0
From another angle, could the multi-scale (ms) codebook be seen as equivalent to another form of latent diffusion?
#72 opened 4 months ago by YilanWang
1
How many GPUs did you use for training?
#44 opened 5 months ago by daixiangzi
3
Can not align FID with provided checkpoint
#69 opened 4 months ago by LiCHH
6
Question about the training dataset - tokenizer
#80 opened 2 months ago by Lucky-Light-Sun
0
How can I train this model on a custom dataset
#79 opened 2 months ago by morestart
1
Weird/inconsistent IS evaluation score.
#77 opened 3 months ago by ChenDRAG
2
Image reconstruction via Transformer.
#55 opened 5 months ago by minimini-1
1
Why multi-scale features partially shared a convolution network via PhiPartiallyShared
#73 opened 3 months ago by sunset-clouds
1
Usage of classifier-free guidance
#76 opened 3 months ago by ChenDRAG
0
We used VAR-CLIP to train a text-to-image model on ImageNet; the results seem pretty good.
#75 opened 3 months ago by daixiangzi
3
T2I Generation
#64 opened 4 months ago by ucasyjz
2
Computation and Memory Consumption When Training Models
#71 opened 4 months ago by walking-shadow
0
https://var.vision/demo is borked
#68 opened 4 months ago by yosun
1
for in/out painting
#70 opened 4 months ago by Youngwoo-git
0
Question about the cross-entropy loss average?
#53 opened 4 months ago by Yheechou
4
The 512 checkpoint?
#57 opened 4 months ago by FanqingM
2
the size of latent space
#65 opened 4 months ago by xinding64
1
Finetuning on own data
#66 opened 4 months ago by LetsGoFir
1
question about ema_vocab_hit_SV
#67 opened 4 months ago by shliu0
1
VQVAE training code
#63 opened 4 months ago by Junda24
0
the patch_nums of 256*256 image
#62 opened 4 months ago by xinding64
0
Inference code keeps generating the same image
#61 opened 4 months ago by moeinheidari7829
2
Training code for VQVAE
#60 opened 4 months ago by zhangjingze21
0
There was no increase in speed after installing flash-attn and xformers.
#59 opened 4 months ago by kongwanbianjinyu
0
Inference after training on own dataset
#58 opened 4 months ago by moeinheidari7829
2
training code for VQVAE
#49 opened 5 months ago by Junda24
2
Question on autoregressive_infer_cfg
#56 opened 4 months ago by sparse-mvs-2
2
When training on 512x512 images, is the 16x16 codebook size also used?
#54 opened 4 months ago by YilanWang
2
Question about the details of the two-stage ablation
#50 opened 5 months ago by YilanWang
3
Question about the images generated by the demo
#52 opened 5 months ago by BigConsin
1
Scalability to multimodal large language models?
#51 opened 5 months ago by DEBIHOOD
0
training with multi-gpu but stuck
#47 opened 5 months ago by Erisura
2
Can the training log be released?
#48 opened 5 months ago by daixiangzi
0
FID misalignment
#45 opened 5 months ago by ckczzj
4
class VQVAE forward function error
#46 opened 5 months ago by woldier
1
How is the multi-level VectorQuantizer used in the VQVAE (stage 1) phase?
#42 opened 5 months ago by YilanWang
3
AdaLNSelfAttn.forward: what is the purpose of this line?
#43 opened 5 months ago by yanghu819
2
Abnormal sample results with `demo_sample.ipynb`
#41 opened 5 months ago by karrykkk
2
fid
#39 opened 5 months ago by 21157651
1
AR Time Complexity?
#35 opened 5 months ago by isaacrob
5
What is your position embedding? Is 2D RoPE a good choice?
#36 opened 5 months ago by renjingneng
1
About progressive training
#37 opened 5 months ago by ParanoidHW
3
FID on Class-Conditioned Evaluation & Normalized Attn in Depth 16/30
#38 opened 5 months ago by lxa9867
2