How to modify the optimizer to achieve gradient accumulation?

adjawdka opened this issue 7 days ago · 0 comments

When I reproduce the code, the training accuracy is slightly lower than the accuracy reported in the paper, and my GPU does not have enough memory for large batch sizes. How can I set up gradient accumulation to emulate a larger batch?
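
For reference, here is a minimal, framework-agnostic sketch of the idea, not this repository's actual optimizer. It assumes plain SGD and equal-sized micro-batches. The names `AccumulatingSGD`, `accum_steps`, and `mse_grad` are hypothetical and exist only for illustration. The optimizer sums the gradients of several small batches and applies one averaged update, which should match a single step on the combined batch.

```python
class AccumulatingSGD:
    """Plain SGD that applies an update only every `accum_steps` calls to step()."""

    def __init__(self, params, lr, accum_steps):
        self.params = params                  # list of floats, updated in place
        self.lr = lr
        self.accum_steps = accum_steps        # hypothetical name: micro-batches per update
        self._buffer = [0.0] * len(params)    # running sum of micro-batch gradients
        self._calls = 0

    def step(self, grads):
        """Add one micro-batch gradient; return True if the weights were updated."""
        for i, g in enumerate(grads):
            self._buffer[i] += g
        self._calls += 1
        if self._calls % self.accum_steps != 0:
            return False                      # keep accumulating, no weight change yet
        for i in range(len(self.params)):
            # Average so the update matches one large batch of equal-sized micro-batches.
            self.params[i] -= self.lr * self._buffer[i] / self.accum_steps
            self._buffer[i] = 0.0             # reset for the next accumulation window
        return True


def mse_grad(w, batch):
    """Gradient of the mean squared error for the toy model y = w[0] * x + w[1]."""
    n = len(batch)
    gw = sum(2 * (w[0] * x + w[1] - y) * x for x, y in batch) / n
    gb = sum(2 * (w[0] * x + w[1] - y) for x, y in batch) / n
    return [gw, gb]


data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

# Reference: one SGD update on the full batch of 4 samples.
w_full = [0.0, 0.0]
g = mse_grad(w_full, data)
w_full = [w_full[0] - 0.01 * g[0], w_full[1] - 0.01 * g[1]]

# Accumulated: two micro-batches of 2 samples, one update at the end.
w_acc = [0.0, 0.0]
opt = AccumulatingSGD(w_acc, lr=0.01, accum_steps=2)
for start in range(0, len(data), 2):
    opt.step(mse_grad(w_acc, data[start:start + 2]))
```

In a deep learning framework the same pattern usually means dividing each micro-batch loss by the number of accumulation steps and calling the optimizer update only once per window. Two caveats: layers such as batch normalization still compute their statistics per micro-batch, so results may not exactly match a true large batch, and any learning-rate schedule should count optimizer updates rather than micro-batches.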