chijames/Poly-Encoder

Something wrong when calculating t_total

Opened this issue · 1 comment

First of all, I really appreciate the nice repo.

The t_total in run.py is calculated as t_total = len(train_dataloader) // args.gradient_accumulation_steps * args.num_train_epochs, and t_total is then passed to transformers.get_linear_schedule_with_warmup. This value indicates the total number of steps in the training process.
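Concretely, the computation above looks like the following sketch (the numbers are made up for illustration and do not come from the repo's config):

```python
# Hypothetical hyperparameters for illustration; none are from run.py.
num_batches_per_epoch = 500          # this is what len(train_dataloader) returns
gradient_accumulation_steps = 2
num_train_epochs = 4

# One optimizer/scheduler step happens every `gradient_accumulation_steps`
# batches, so the total number of scheduler steps over training is:
t_total = num_batches_per_epoch // gradient_accumulation_steps * num_train_epochs
print(t_total)  # 1000
```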

However, I think the total number of steps should be the number of batches * the number of epochs. Therefore, the code for calculating t_total should be t_total = len(train_dataloader) // (args.train_batch_size * args.gradient_accumulation_steps) * args.num_train_epochs

If I'm wrong, please let me know what I am missing.

Hi @InhyeokYoo, I think train_dataloader already does len(train_examples) // batch_size internally when you call train_dataloader = DataLoader(batch_size=.....)
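This is easy to verify with a minimal sketch using torch.utils.data.DataLoader (the dataset size and batch size below are made up): len(dataloader) returns the number of batches per epoch, i.e. ceil(num_examples / batch_size) when drop_last is False, not the number of examples.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

num_examples, batch_size = 100, 32   # made-up sizes for illustration
dataset = TensorDataset(torch.zeros(num_examples, 3))
dataloader = DataLoader(dataset, batch_size=batch_size)

print(len(dataset))      # 100 -- number of examples
print(len(dataloader))   # 4   -- number of batches, ceil(100 / 32)
assert len(dataloader) == math.ceil(num_examples / batch_size)
```

So dividing len(train_dataloader) by args.train_batch_size again would divide by the batch size twice, which is why the original formula only divides by args.gradient_accumulation_steps.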