neulab/awesome-align

Relationship between num_train_epochs and max_steps

dmitrytoda opened this issue · 2 comments

I did training with num_train_epochs=1 and max_steps=20000. It did 1 epoch of 20k steps, all good.
Then I did training with num_train_epochs=2 and max_steps=20000. I expected it to do 2 epochs of 20k steps each, but instead it only did 20k steps total.

So if I want to train longer, should I just change max_steps to, say, 40000 and leave num_train_epochs=1? But what does num_train_epochs do then?

Hi, if num_train_epochs=n and it takes m steps to go through one epoch, the total number of training steps would be min(m*n, max_steps).

For example, if you set num_train_epochs=1 and max_steps=20000, then if m > 20k the program would train the model for 20k steps; otherwise it would train the model for m steps.
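
To make the rule concrete, here is a minimal Python sketch of the formula above. It is not code from awesome-align: the function name `total_training_steps` and the values of m used in the examples are hypothetical, chosen only for illustration, and the sketch assumes max_steps is a positive cap.

```python
def total_training_steps(steps_per_epoch: int, num_train_epochs: int, max_steps: int) -> int:
    """Illustrative helper (hypothetical name), assuming max_steps is a positive cap.

    Implements the rule from this thread: total steps = min(m * n, max_steps),
    where m is the number of steps in one epoch and n is num_train_epochs.
    """
    epoch_budget = steps_per_epoch * num_train_epochs  # steps needed for n full epochs
    return min(epoch_budget, max_steps)                # whichever limit is hit first


# Hypothetical m = 30000 steps per epoch (larger than max_steps):
print(total_training_steps(30000, 1, 20000))  # capped by max_steps
print(total_training_steps(30000, 2, 20000))  # still capped by max_steps
print(total_training_steps(30000, 2, 40000))  # raising max_steps allows longer training

# Hypothetical m = 15000 steps per epoch (smaller than max_steps):
print(total_training_steps(15000, 1, 20000))  # stops after one full epoch
```

Under this rule, raising num_train_epochs alone has no effect once max_steps is the smaller of the two limits, so training longer means raising max_steps (and num_train_epochs too, if m*n would otherwise become the limit).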

Thank you, that explains it!