Tony-Y/pytorch_warmup

Why did my learning rate drop from the initial lr

sjchasel opened this issue · 3 comments

In every batch, I execute:

loss.backward()
optimizer.zero_grad()
optimizer.step()
with warmup_scheduler.dampening():
    lr_scheduler.step()

There is no warm-up phase; the learning rate just drops from the initial lr right away.

Your code does not optimize the model parameters at all because optimizer.zero_grad() is called after loss.backward(): the gradients are zeroed out before optimizer.step() can use them. If this is unclear, please read the following tutorial:

https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html#optimizer
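
For reference, here is a minimal sketch of the corrected ordering: zero the gradients before the backward pass, step the optimizer, and only then step the LR scheduler inside the warmup dampening context. The model, loss, and scheduler choices below are only illustrative, not taken from the original post.

import torch
import pytorch_warmup as warmup

# Illustrative setup (replace with your own model, optimizer, and schedulers).
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.999)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
loss_fn = torch.nn.MSELoss()

for step in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)  # stand-in for a real batch
    optimizer.zero_grad()                 # clear stale gradients before backward
    loss = loss_fn(model(x), y)
    loss.backward()                       # compute fresh gradients
    optimizer.step()                      # update parameters using those gradients
    with warmup_scheduler.dampening():    # dampen the LR during the warm-up period
        lr_scheduler.step()               # then advance the LR schedule

With this order, optimizer.step() sees the gradients from loss.backward(), and the learning rate ramps up during warm-up instead of being dropped immediately by the scheduler.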

Did you solve this issue?