Issues
About the tT_loss
#63 opened by zzc681 · 7 comments
Difficulty in running code
#72 opened by ajtheb · 8 comments
About {path-to-diffusion-lm}
#34 opened by lcy5058 · 0 comments
Seq2Seq tasks with Diffusion LM
#71 opened by chiral-carbon · 0 comments
KeyError: 'eval_loss' raised in the Controllable Text Generation section, once the model had trained for 6 epochs and started evaluating
#65 opened by Markkk111 · 1 comment
E2E training procedure
#67 opened by elephantmipt · 0 comments
Questions about the NLL loss
#66 opened by AlonzoLeeeooo · 2 comments
The difference between this code and the IDDPM paper in the run_loop function of train_util.py
#64 opened by xyz321123 · 0 comments
Error when running: Exception in thread Thread-4: … ValueError: signal number 32 out of range
#61 opened by Markkk111 · 0 comments
Baseline reproduction
#60 opened by YANI-ALT · 1 comment
Why not directly use Emb(W) as X_0?
#56 opened by leekum2018 · 2 comments
Training on A100
#53 opened by mathematiguy · 2 comments
What if DiffusionLM is initialized with BERT?
#40 opened by Hzfinfdu · 0 comments
Some problems reproducing the results
#51 opened by arealgoodname · 0 comments
Why are only padding tokens generated after a period of training, but no words?
#48 opened by greens007 · 0 comments
Training Cost due to the EMA mechanism
#50 opened by Lancelot39 · 3 comments
Do we need to scale word embeddings to [-1, 1]?
#49 opened by tj-zhu · 2 comments
Losses for E2E Training
#47 opened by zanussbaum · 3 comments
Where is the mbr.py file?
#43 opened by smiles724 · 2 comments
Why model.model.module instead of model.model?
#44 opened by smiles724 · 2 comments
How to train a new diffusion model & classifier with different diff_steps or embedding dimension?
#37 opened by ChorlingLau · 4 comments
How did you derive your sampling algorithm?
#39 opened by jzhang38 · 1 comment
Wandb log or Codalab log
#35 opened by ShilongYuan · 1 comment
top_p parameter and scaling of timesteps
#38 opened by rabeeh-karimi · 4 comments
Are these normal results?
#25 opened by ChorlingLau · 3 comments
Error message when trying to train the model
#32 opened by JamesL404 · 1 comment
The effect of "logp_term"
#27 opened by lgs00 · 1 comment
How to control the length
#29 opened by gwang-kim · 1 comment
Train the diffusion model for sentence infilling
#26 opened by lwmlyy · 2 comments
License
#23 opened by michaelrglass · 1 comment
Strength of classifiers vs. results: do all baselines in the paper use the same classifier as Diffusion-LM?
#22 opened by jpilaul