microsoft/ProphetNet

GENIE: decoder_nll loss is always equal to 0 for training from scratch

BaohaoLiao opened this issue

Hi @qiweizhen,

I'm trying to reproduce your reported from-scratch training result on XSum. However, the decoder_nll loss is always exactly 0, which is odd for a cross-entropy loss.

If I load your pre-trained model instead, the loss is non-zero. Do you know the reason?
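For what it's worth, here is the sanity check I would run, assuming decoder_nll is the standard token-level cross-entropy used by embedding-diffusion LMs (the `pad_id` argument and tensor shapes below are my assumptions, not the repo's actual API):

```python
import torch
import torch.nn.functional as F

def decoder_nll_check(logits, target_ids, pad_id=0):
    """Sanity-check a token-level cross-entropy (decoder_nll)."""
    # logits: (batch, seq_len, vocab); target_ids: (batch, seq_len)
    loss = F.cross_entropy(
        logits.transpose(1, 2),  # cross_entropy expects (batch, vocab, seq_len)
        target_ids,
        ignore_index=pad_id,
        reduction="none",
    )  # (batch, seq_len), zeros at ignored positions
    n_real = (target_ids != pad_id).sum()
    if n_real == 0:
        # Every target position is padding/masked, so the mean over real
        # tokens is trivially 0 -- a common cause of a constant-zero loss.
        raise ValueError("all target tokens are masked; check the data pipeline")
    return loss.sum() / n_real

# Example: random logits should give a loss near log(vocab_size), not 0.
logits = torch.randn(2, 8, 100)
targets = torch.randint(1, 100, (2, 8))
print(decoder_nll_check(logits, targets))  # ~ log(100) ≈ 4.6
```

With untrained random logits the loss should sit near log(vocab_size); a constant 0 usually points at an all-masked target batch rather than at the model.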

Hi, I'm trying to reproduce the XSum from-scratch result too, using the recommended parameters from the README, but my ROUGE scores come out much lower than those reported in the paper. Any suggestions? @qiweizhen, thank you!
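For reference, this is roughly how I score the outputs, using Google's rouge-score package; the paper may have used a different ROUGE implementation (e.g. pyrouge/files2rouge), which can shift the numbers slightly:

```python
from rouge_score import rouge_scorer  # pip install rouge-score

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
# score(reference, prediction); both arguments are plain strings
scores = scorer.score(
    "police killed the gunman",      # reference summary (toy example)
    "the gunman was shot by police",  # generated summary (toy example)
)
for name, s in scores.items():
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```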

Diffusion models without pre-training often require more training steps. If you want to reproduce the results from scratch, you need to increase --lr_anneal_steps (e.g., 400k steps for XSum). We hope this suggestion helps.
We have noticed that our description of training from scratch in the README has caused some misunderstanding, and we will update and correct it in the next version. Thank you for your feedback.
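For clarity, if GENIE follows the improved-diffusion training-loop convention that the flag name suggests, --lr_anneal_steps both caps the total number of updates and linearly anneals the learning rate to zero over that horizon; a minimal sketch of that assumption:

```python
def anneal_lr(step, base_lr, lr_anneal_steps):
    # Linear decay of the learning rate to zero over lr_anneal_steps,
    # following the improved-diffusion training-loop convention; whether
    # GENIE does exactly this is an assumption based on the flag name.
    if not lr_anneal_steps:
        return base_lr
    frac_done = min(step / lr_anneal_steps, 1.0)
    return base_lr * (1 - frac_done)

# With --lr_anneal_steps 400000, the LR only reaches zero at step 400k;
# a much smaller value ends training (and learning) far too early.
```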