FoundationVision/LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python · MIT

Issues

RuntimeError: shape '[1, 512, 1, 32, 2]' is invalid for input of size 16448
#21 opened 5 months ago by beniz
11
About finetuning on pretrained LLM models.
#73 opened 12 days ago by ZeyuLing
0
About `data_range` in ssim_loss
#72 opened 14 days ago by xyfJASON
0
Why use such a big model---google/flan-t5-xl
#71 opened 15 days ago by ranck626
0
How can I train image tokenizers and AR models for text-conditional image generation on my own dataset? Could you provide an example? Thanks
#59 opened 3 months ago by gzhuinjune
0
KeyError: 'optimizer'
#45 opened 4 months ago by sugary199
4
Hello, I trained on my own dataset using your code, but the images come out very blurry
#29 opened 5 months ago by gzhuinjune
9
batch_size setting for the dataloader
#70 opened 21 days ago by yyyxcleo
0
Difference from AIM
#69 opened 24 days ago by Jmh0527
0
StyleGAN vs PatchGAN
#68 opened 24 days ago by sunset-clouds
0
About training losses and evaluation parameter settings
#56 opened 3 months ago by MrCrims
1
cfg-interval
#67 opened a month ago by wangyf8848
0
Embedding layer
#66 opened a month ago by wangyf8848
0
How to reproduce the codebook usage
#65 opened a month ago by BaohaoLiao
0
Why is the model GPT in the code?
#64 opened a month ago by wangyf8848
1
Question about cannot reproduce FID results
#53 opened 4 months ago by Ghy0501
2
The demo does not work well
#63 opened 2 months ago by bigbrother001
0
T2I performance on mscoco
#47 opened 4 months ago by HalvesChen
1
The effect of VQVAE's training data on image generation
#62 opened 2 months ago by HalvesChen
0
Recommendation for decoder finetuning
#61 opened 2 months ago by elias-ramzi
0
Only inference
#60 opened 2 months ago by heavenhellchen
0
When I set ipdb in gpt.py, I encounter this error: torch._dynamo.exc.InternalTorchDynamoError: `example_value` needs to be a `FakeTensor` wrapped by this instance of Dynamo. Found: tensor(..., device='meta', size=(2,))
#58 opened 3 months ago by BinZhu-ece
0
About evaluation on private dataset
#55 opened 4 months ago by MrCrims
0
About RoPE in the sampling process
#54 opened 4 months ago by Leedonus
6
Loss increases during training the T2I model
#51 opened 4 months ago by Epiphqny
0
Questions about the results of your experiment.
#38 opened 5 months ago by potatowarriors
2
add t5 extraction instructions in Readme or Getting started for t2i training
#52 opened 4 months ago by sahil02235
0
T2I VQVAE Training Details
#50 opened 4 months ago by alexanderswerdlow
0
Mask guidance, inpainting and outpainting
#49 opened 4 months ago by sahil02235
7
[Feature] ControlNet support via process similar to PixArt's ControlNet-Transformer
#24 opened 5 months ago by kabachuha
1
Cannot Reproduce LlamaGen-B or L numbers using provided models
#48 opened 4 months ago by vkramanuj
1
T2I Data
#37 opened 4 months ago by HalvesChen
0
FID results of GPT-L and GPT-1B on 256*256 images
#46 opened 4 months ago by LutingWang
3
FID Evaluation not matching paper results for VQ-16 checkpoint
#34 opened 4 months ago by vkramanuj
3
Can LlamaGen predict a [EOS] token when inferencing?
#44 opened 4 months ago by luminousking
6
Difficulty in reproducing results with pre-trained weights
#41 opened 5 months ago by Rishit-dagli
1
Did you try class-to-image generation at an image resolution of 512x512?
#42 opened 5 months ago by OliverRensu
0
Training Results
#39 opened 5 months ago by Huage001
4
tokenizer of 4 dim
#40 opened 5 months ago by DidiD1
0
Questions about the discriminator
#35 opened 5 months ago by Doctor-James
1
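Hello, finetuning vq_ds16_c2i_training.pt on private general-purpose images made the results worse, so I tried continuing to finetune on the ImageNet dataset, and the results also got worse. What could be going wrong?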
#36 opened 5 months ago by 353xiong
1
Discriminator is not training properly?
#27 opened 5 months ago by ThisisBillhe
4
Inquiry about the OpenImages dataset
#32 opened 5 months ago by RobertLuo1
2
Text embedding inject
#33 opened 5 months ago by daiyixiang666
2
Question: why not try using an image tokenizer with a ready-made LLM (Llama 3, etc.) and LoRA?
#31 opened 5 months ago by lucasjinreal
9
Question about text-conditional generation.
#30 opened 5 months ago by Yangr116
2
Which parameters are trainable? Are the encoder and decoder in VQGAN fixed? Is the llama fixed?
#26 opened 5 months ago by tanshuai0219
1
Train script
#28 opened 5 months ago by potatowarriors
1
[Feature] Inpainting script
#25 opened 5 months ago by kabachuha
0
Mismatched model weights document
#22 opened 5 months ago by Artanic30
2