Something wrong when calculating t_total
Opened this issue · 1 comment
First of all, I really appreciate the nice repo.

The `t_total` in `run.py` is calculated as `t_total = len(train_dataloader) // args.gradient_accumulation_steps * args.num_train_epochs`, and that `t_total` is passed into `transformers.get_linear_schedule_with_warmup`. This indicates the total number of steps of the training process.
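For context, here is a minimal runnable sketch of how `t_total` feeds the scheduler; the model, optimizer, warmup fraction, and all sizes are made-up stand-ins, not taken from the repo:

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Toy stand-ins so the sketch runs; in run.py these come from the training setup.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
num_batches_per_epoch = 1250        # stands in for len(train_dataloader)
gradient_accumulation_steps = 2     # stands in for args.gradient_accumulation_steps
num_train_epochs = 3                # stands in for args.num_train_epochs

# Same formula as run.py: one optimizer step per `gradient_accumulation_steps` batches.
t_total = num_batches_per_epoch // gradient_accumulation_steps * num_train_epochs

scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * t_total),  # warmup fraction chosen arbitrarily for the sketch
    num_training_steps=t_total,           # this is the value in question
)
```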
However, I guess the total number of steps should be the number of batches * epochs. Therefore, the code for calculating `t_total` should be `t_total = len(train_dataloader) // (args.train_batch_size * args.gradient_accumulation_steps) * args.num_train_epochs`.
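To make the reasoning concrete, here is a worked version of the proposed formula under its assumption that `len(train_dataloader)` counts examples; every number is invented for illustration:

```python
# Worked example of the proposed formula; all numbers are made up.
num_examples = 10_000               # assumed len(train_dataloader) under this reading
train_batch_size = 8
gradient_accumulation_steps = 2
num_train_epochs = 3

# Proposed: examples // (batch size * accumulation) * epochs
t_total = num_examples // (train_batch_size * gradient_accumulation_steps) * num_train_epochs
print(t_total)  # 1875 optimizer steps
```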
If I'm wrong, please let me know what I'm missing.
Hi @InhyeokYoo, I think `train_dataloader` already does the `len(train_examples) // batch_size` part: when you construct it with `train_dataloader = DataLoader(dataset, batch_size=...)`, `len(train_dataloader)` returns the number of batches, not the number of examples.
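A quick way to check this, with arbitrary toy sizes:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 100 toy examples, batch size 8 (numbers chosen arbitrarily).
dataset = TensorDataset(torch.arange(100))
loader = DataLoader(dataset, batch_size=8)

print(len(dataset))  # 100 -> number of examples
print(len(loader))   # 13  -> number of batches, ceil(100 / 8), since drop_last=False
```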