XiangLi1999/Diffusion-LM

Why not directly use Emb(W) as X_0?

leekum2018 opened this issue · 2 comments

Thanks for your nice work. I have a question that I am having difficulty with: why does the paper define $X_0 = Emb(W) + N(0, \sigma_0 I)$ instead of directly using $Emb(W)$ as $X_0$? Looking forward to your reply, thanks!

+1, I also have this question

FYI, this was discussed on OpenReview. $\sigma_0$ is set to 0.0001, so the distribution of $X_0$ becomes a spiky Gaussian around $Emb(W)$, and according to the authors it was an empirical choice.
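To get a feel for how small that perturbation is, here is a minimal NumPy sketch of $X_0 = Emb(W) + N(0, \sigma_0 I)$. It is not taken from the Diffusion-LM code: `sample_x0`, the embedding shape, and the random stand-in embeddings are all hypothetical, and I am assuming $\sigma_0$ is the variance (so the standard deviation is $\sqrt{\sigma_0} = 0.01$), which is how the formula in the question reads.

```python
import numpy as np


def sample_x0(emb_w, sigma_0=1e-4, rng=None):
    """Hypothetical helper: draw X_0 = Emb(W) + N(0, sigma_0 * I).

    Assumes sigma_0 is a variance, so the per-coordinate standard
    deviation passed to the sampler is sqrt(sigma_0).
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=np.sqrt(sigma_0), size=emb_w.shape)
    return emb_w + noise


rng = np.random.default_rng(0)
# Stand-in for Emb(W): 8 tokens with 16-dimensional embeddings (made-up sizes).
emb_w = rng.normal(size=(8, 16))
x0 = sample_x0(emb_w, sigma_0=1e-4, rng=rng)

# Largest per-coordinate deviation of X_0 from Emb(W).
max_dev = np.abs(x0 - emb_w).max()
print(f"max |X_0 - Emb(W)| = {max_dev:.4f}")
```

Under these assumptions the samples stay very close to the embeddings, which is what "spiky Gaussian" means here: $X_0$ is nearly, but not exactly, $Emb(W)$.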