FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Python · MIT
Issues
About finetuning on pretrained LLM models.
#73 opened by ZeyuLing - 0
About `data_range` in ssim_loss
#72 opened by xyfJASON - 0
Why use such a big model (google/flan-t5-xl)?
#71 opened by ranck626 - 0
How can I train Image tokenizers and AR models for text-conditional image generation on my own dataset? Could you provide an example? Thanks
#59 opened by gzhuinjune - 4
KeyError: 'optimizer'
#45 opened by sugary199 - 9
Hello, I trained on my own dataset with your code, but the images became very blurry
#29 opened by gzhuinjune - 0
Setting the dataloader batch_size
#70 opened by yyyxcleo - 0
StyleGAN vs PatchGAN
#68 opened by sunset-clouds - 1
cfg-interval
#67 opened by wangyf8848 - 0
Embedding layer
#66 opened by wangyf8848 - 0
How to reproduce the codebook usage
#65 opened by BaohaoLiao - 1
Why is the model GPT in the code?
#64 opened by wangyf8848 - 2
Question about being unable to reproduce FID results
#53 opened by Ghy0501 - 0
The demo does not work well
#63 opened by bigbrother001 - 1
T2I performance on mscoco
#47 opened by HalvesChen - 0
Recommendation for decoder finetuning
#61 opened by elias-ramzi - 0
Only inference
#60 opened by heavenhellchen - 0
When I set ipdb in gpt.py, I encounter this error: torch._dynamo.exc.InternalTorchDynamoError: `example_value` needs to be a `FakeTensor`wrapped by this instance of Dynamo. Found: tensor(..., device='meta', size=(2,))
#58 opened by BinZhu-ece - 0
About evaluation on private dataset
#55 opened by MrCrims - 6
About RoPE in the sampling process
#54 opened by Leedonus - 0
Loss increases during training the T2I model
#51 opened by Epiphqny - 2
Add T5 extraction instructions to the README or Getting Started for T2I training
#52 opened by sahil02235 - 0
T2I VQVAE Training Details
#50 opened by alexanderswerdlow - 7
Mask guidance, inpainting and outpainting
#49 opened by sahil02235 - 1
[Feature] ControlNet support via process similar to PixArt's ControlNet-Transformer
#24 opened by kabachuha - 1
T2I Data
#37 opened by HalvesChen - 3
Training Results
#39 opened by Huage001 - 0
tokenizer of 4 dim
#40 opened by DidiD1 - 1
Questions about the discriminator
#35 opened by Doctor-James - 1
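Hello, fine-tuning vq_ds16_c2i_training.pt on private general-purpose images made the results worse, so I tried continuing to fine-tune on the ImageNet dataset, and the results got worse as well. What went wrong?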
#36 opened by 353xiong - 4
Discriminator is not training properly?
#27 opened by ThisisBillhe - 2
Inquiry about the OpenImages dataset
#32 opened by RobertLuo1 - 2
Text embedding inject
#33 opened by daiyixiang666 - 9
Question: why not try using an image tokenizer with a ready-made LLM such as Llama 3, fine-tuned with LoRA?
#31 opened by lucasjinreal - 2
Question about text-conditional generation.
#30 opened by Yangr116 - 1
Which parameters are trainable? Are the encoder and decoder in VQGAN fixed? Is the llama fixed?
#26 opened by tanshuai0219 - 1
Train script
#28 opened by potatowarriors - 0
[Feature] Inpainting script
#25 opened by kabachuha - 2
Mismatched model weights document
#22 opened by Artanic30