cylnlp/dialogsum

Hyperparameters with fairseq BART

shakeley opened this issue · 1 comment

Hi,

Thanks for the great dataset! I am trying to reproduce the results from the paper using fairseq. I noticed that the paper says "5,000 training steps/200 warmup steps and learning rate is set to 3e−5."

Besides those settings, I use the following params (a rough sketch of the full command is given below):

MAX_TOKENS=2048
UPDATE_FREQ=4
MAX_EPOCH=6
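
For context, here is roughly the full command I am working from. It follows the standard fairseq BART summarization fine-tuning recipe, with the learning rate, warmup, and total updates from the paper plugged in; the data/checkpoint paths and the remaining flags are just my local setup, so treat this as a sketch rather than the exact recipe used for the paper:

```bash
# Fine-tune BART-large on binarized DialogSum data.
# BART_PATH and DATA_BIN are placeholders for my local paths.
BART_PATH=bart.large/model.pt
DATA_BIN=dialogsum-bin

TOTAL_NUM_UPDATES=5000   # from the paper
WARMUP_UPDATES=200       # from the paper
LR=3e-05                 # from the paper
MAX_TOKENS=2048
UPDATE_FREQ=4
MAX_EPOCH=6

fairseq-train "$DATA_BIN" \
    --restore-file "$BART_PATH" \
    --arch bart_large \
    --task translation \
    --source-lang source --target-lang target \
    --truncate-source \
    --layernorm-embedding \
    --share-all-embeddings \
    --share-decoder-input-output-embed \
    --reset-optimizer --reset-dataloader --reset-meters \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --optimizer adam --adam-betas "(0.9, 0.999)" --adam-eps 1e-08 \
    --clip-norm 0.1 --weight-decay 0.01 \
    --lr $LR --lr-scheduler polynomial_decay \
    --warmup-updates $WARMUP_UPDATES --total-num-update $TOTAL_NUM_UPDATES \
    --max-update $TOTAL_NUM_UPDATES \
    --max-tokens $MAX_TOKENS --update-freq $UPDATE_FREQ \
    --max-epoch $MAX_EPOCH \
    --skip-invalid-size-inputs-valid-test \
    --find-unused-parameters \
    --fp16
```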

My results are strange (only about 37 ROUGE-1). Could you provide more details about the fairseq training setup?
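
In case the problem is on the decoding side rather than training, this is roughly how I am generating test summaries before computing ROUGE. The beam and length settings are simply borrowed from the fairseq CNN/DailyMail example, so they may be a poor fit for DialogSum's much shorter summaries:

```bash
# Generate test-set summaries from the fine-tuned checkpoint (paths are placeholders).
# Note: the hypotheses come out as GPT-2 BPE token IDs and still need to be
# decoded with the GPT-2 encoder before ROUGE scoring.
fairseq-generate dialogsum-bin \
    --path checkpoints/checkpoint_best.pt \
    --task translation \
    --gen-subset test \
    --truncate-source \
    --batch-size 32 \
    --beam 4 \
    --lenpen 2.0 \
    --min-len 55 \
    --max-len-b 140 \
    --no-repeat-ngram-size 3
# --min-len 55 is the CNN/DailyMail default and is likely too long for DialogSum;
# it probably needs lowering.
```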

Hi @shakeley, have you seen the discussion in this issue, and did it solve your problem?
If not, could you send me your model output so I can take a closer look at what might be going wrong?