Issues
Increasing token size
#71 opened by jinnan-chen - 1
Reproduce the results of Table 1
#72 opened by Doraemonzzz - 3
noise_schedule setting
#70 opened by wangherr - 3
Loss is nan, stopping training
#69 opened by tau-yihouxiang - 10
Doesn't work well in speech generation task.
#35 opened by FacePoluke - 7
Differences between Latent Space and Pixel Space
#67 opened by OK932001 - 1
MAR used for image restoration
#65 opened by chengliang0109 - 1
Request for Your Email Address
#64 opened by ziwei-cui - 4
How is 0.2325 calculated?
#62 opened by shuowang666 - 2
About diffloss
#63 opened by RohollahHS - 1
image of the training loss change in TensorBoard
#61 opened by wangherr - 6
mask generation for training and inference with MAR
#57 opened by YOU-k - 6
Clarification of speed
#45 opened by zehongs - 10
About VAE channels
#56 opened by pokameng - 6
Questions about causal methods
#27 opened by Tom-zgt - 2
About the positional encodings of diffusion MLP.
#54 opened by whwjdqls - 1
The influence of VAE feature dim
#53 opened by Tom-zgt - 2
About training with cached vae latents
#52 opened by RohollahHS - 3
Challenges in Memorizing Single or Few Images
#49 opened by kifarid - 2
About Encoder and Decoder in MAE
#42 opened by RohollahHS - 31
About Train
#36 opened by pokameng - 5
Faster training with fp16 or bf16
#29 opened by shaochenze - 3
Request for Causal AR Version Release
#39 opened by aengusng8 - 1
Reproducing the BASE model.
#48 opened by cxxgtxy - 5
model and training code for the AR variant
#34 opened by MikeWangWZHL - 2
Dataset used for training...
#50 opened by Nobody-Zhang - 2
Reconstruction loss in ELBO
#51 opened by Paulmzr - 1
Small Issue/error in the code
#47 opened by niklasbubeck - 14
Add HF integration to MAR
#32 opened by jadechoghari - 1
About masking ratio
#43 opened by RohollahHS - 1
Training Problems
#44 opened by drx-code - 2
Unable to calculate FID
#41 opened by xbyym - 5
Difference Between MAR and MAGE
#30 opened by JeremyCJM - 2
CFG for cross-entropy
#38 opened by shaochenze - 4
About Training Loss
#37 opened by Ferry1231 - 1
Is Autoencoder ok?
#33 opened by Ferry1231 - 9
The CFG strategy - linear. vs constant
#31 opened by yuhuUSTC - 4
generate images with arbitrary resolutions,
#28 opened by Leiii-Cao - 1
Inference details of an ablation experiment.
#26 opened by tgxs002 - 2
VAE decoded as NaN in early stages of training
#25 opened by xiazhi1 - 2
Why main_cache do not use flip augment?
#24 opened by xiazhi1 - 2
How should inference be performed when using VQ-16 (discrete)? During decoding, should we use the AR output for VQ and then decode?
#23 opened by Tom-zgt - 4
About the mask schedule during training
#22 opened by zythenoob