Relationship between num_train_epochs and max_steps
dmitrytoda opened this issue · 2 comments
I did training with num_train_epochs=1 and max_steps=20000. It did 1 epoch of 20k steps, all good.
Then I did training with num_train_epochs=2 and max_steps=20000. I expected it to do 2 epochs of 20k steps each, but instead it only did 20k steps total.
So if I want to train longer, should I just change max_steps to, say, 40000 and leave num_train_epochs=1? But what does num_train_epochs do then?
Hi, if num_train_epochs=n and it takes m steps to go through one epoch, the total number of training steps would be min(m*n, max_steps).
If I set num_train_epochs=1 and max_steps=20000, then if m > 20k the program would train the model for 20k steps; otherwise it would train the model for m steps.
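To make the rule above concrete, here is a minimal sketch of the arithmetic. The helper name total_training_steps and the steps-per-epoch values are hypothetical, chosen only for illustration; this is not the training script's actual code, just the min(m*n, max_steps) rule written out.

```python
def total_training_steps(steps_per_epoch: int, num_train_epochs: int, max_steps: int) -> int:
    """Hypothetical helper: total steps under the min(m*n, max_steps) rule.

    steps_per_epoch is m, num_train_epochs is n. Not taken from the
    training script itself; it only restates the rule described above.
    """
    return min(steps_per_epoch * num_train_epochs, max_steps)


# Assumed example: one epoch takes m = 50,000 steps (more than max_steps).
print(total_training_steps(50_000, 1, 20_000))  # capped by max_steps
print(total_training_steps(50_000, 2, 20_000))  # still capped by max_steps

# Assumed example: one epoch takes m = 15,000 steps (fewer than max_steps).
print(total_training_steps(15_000, 1, 20_000))  # limited by m*n
print(total_training_steps(15_000, 2, 40_000))  # two full epochs fit
```

Under this rule, raising max_steps alone only lengthens training while m*n is still the larger of the two values; once max_steps exceeds m*n, num_train_epochs becomes the limit.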
Thank you, that explains it!