How to schedule LR with warmup by global step initially, and then by epoch after warmup?
JohnHerry opened this issue · 9 comments
Hi, Tony,
I have a request: during warmup training in the first epoch, the warmup scheduler should adjust the learning rate every step (or every N steps), and after the warmup stage a regular LR scheduler should adjust the learning rate every epoch. Is there any example code for this?
Do you want something like the LR scheduler with warmup provided by Hugging Face?
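For reference, that kind of scheduler is stepped once per training step for both the warmup ramp and the decay. A minimal sketch, assuming the transformers package is installed; the model, optimizer, and step counts below are only placeholders:

import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(10, 2)                             # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Linear warmup for the first 1,000 steps, then linear decay to zero at step 10,000.
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=1000, num_training_steps=10000)

for step in range(10000):
    optimizer.step()   # a real loop would compute a loss and call backward() first
    scheduler.step()   # stepped every global step, not every epoch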
I'm sorry for misunderstanding your request. You have to combine a global-step scheduler with an epoch-based LR scheduler, but it may not be as easy as it sounds. I don't know of any such example code, but I'll let you know if I find some.
I made some sample code: https://gist.github.com/Tony-Y/49d6cffa21e60095fdf9b1bec31cdbaa
for batch_idx, (data, target) in enumerate(progressbar(train_loader)):
    ...
    extra_params["global_step"] += 1
    if extra_params["global_step"] <= extra_params["warmup_period"]:
        with warmup_scheduler.dampening():
            pass
    elif (extra_params["global_step"] - extra_params["warmup_period"]) % len(train_loader) == 0:
        lr_scheduler.step()
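For completeness, the loop above assumes roughly the following setup. This is only a sketch: the model, optimizer, dataset, epoch count, and the warmup period of 2000 steps are placeholders, progressbar comes from the progressbar2 package (plain iteration works too), and warmup.UntunedLinearWarmup(optimizer) could be used instead of LinearWarmup:

import torch
import pytorch_warmup as warmup
from progressbar import progressbar

model = torch.nn.Linear(10, 2)             # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
num_epochs = 10                            # placeholder epoch count

# Placeholder data loader standing in for the real training data.
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(100, 10), torch.randint(0, 2, (100,))),
    batch_size=10)

# Epoch-based schedule that should only take effect after warmup.
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[num_epochs // 3], gamma=0.1)

# Warmup from the pytorch_warmup package; its dampening() context is used in the loop above.
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period=2000)

# Bookkeeping dictionary shared with the training loop above.
extra_params = {"global_step": 0, "warmup_period": 2000}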
I revised the code: https://gist.github.com/Tony-Y/1aa2196ce161d8a4da90cf027ec0f260
New code:
class EpochSchedulerWithWarmup:
    def __init__(self, warmup_period, every_n_steps, steps_per_epoch, warmup_scheduler, lr_scheduler):
        self.global_step = 0
        self.warmup_period = warmup_period
        self.every_n_steps = every_n_steps
        self.steps_per_epoch = steps_per_epoch
        self.warmup_scheduler = warmup_scheduler
        self.lr_scheduler = lr_scheduler

    def step(self):
        self.global_step += 1
        if self.global_step <= self.warmup_period and self.global_step % self.every_n_steps == 0:
            with self.warmup_scheduler.dampening():
                pass
        elif (self.global_step - self.warmup_period) % self.steps_per_epoch == 0:
            self.lr_scheduler.step()
Usage:
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[num_epochs//3], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
lr_scheduler_with_warmup = EpochSchedulerWithWarmup(
    warmup_period=2000, every_n_steps=1,
    steps_per_epoch=len(dataloader),
    warmup_scheduler=warmup_scheduler,
    lr_scheduler=lr_scheduler)

for epoch in range(1, num_epochs+1):
    for iter, batch in enumerate(dataloader):
        optimizer.zero_grad()
        loss = ...
        loss.backward()
        optimizer.step()
        lr_scheduler_with_warmup.step()
Does this code resolve your issue?
EpochSchedulerWithWarmup has a bug. Its step() should be:
def step(self):
    self.global_step += 1
    if self.global_step <= self.warmup_period:
        if self.global_step % self.every_n_steps == 0:
            with self.warmup_scheduler.dampening():
                pass
    elif (self.global_step - self.warmup_period) % self.steps_per_epoch == 0:
        self.lr_scheduler.step()
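As a quick sanity check of the corrected version, here is a rough, self-contained sketch assuming the class above with this step(); the model, learning rate, and step counts are made up for illustration. The warmup dampening fires only during the first warmup_period steps, and the epoch scheduler fires only afterwards:

import torch
import pytorch_warmup as warmup

model = torch.nn.Linear(4, 1)                              # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=0.1)
num_epochs, steps_per_epoch, warmup_period = 3, 10, 15     # made-up sizes

lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[2], gamma=0.1)
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period=warmup_period)

scheduler = EpochSchedulerWithWarmup(
    warmup_period=warmup_period, every_n_steps=1,
    steps_per_epoch=steps_per_epoch,
    warmup_scheduler=warmup_scheduler,
    lr_scheduler=lr_scheduler)

for epoch in range(num_epochs):
    for _ in range(steps_per_epoch):
        optimizer.step()   # a real loop would compute a loss and call backward() first
        scheduler.step()
    # The LR ramps up over the first 15 steps, then changes only when the epoch scheduler fires.
    print(epoch, optimizer.param_groups[0]["lr"])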
I checked that every_n_steps works well: https://gist.github.com/Tony-Y/6c79267cab84f3d0a2309f25a9123da4
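If you want the dampening applied only every N steps during warmup rather than every step, pass a larger every_n_steps when constructing the wrapper. A small sketch reusing the names from the usage example above; the value 10 is only illustrative:

# Apply the warmup dampening only once every 10 global steps during the warmup phase.
lr_scheduler_with_warmup = EpochSchedulerWithWarmup(
    warmup_period=2000, every_n_steps=10,
    steps_per_epoch=len(dataloader),
    warmup_scheduler=warmup_scheduler,
    lr_scheduler=lr_scheduler)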
I think this issue has been resolved. Reopen it if not.
Thank you very much for the kind help. I will give it a try.
Thank you for all the help. It is very effective! The training process is stable now. Thanks.
I'm happy to hear that.