Target optimizer not set properly when loading from state dict
stefan-baumann commented
When loading a `GradualWarmupScheduler` from a state dict to resume training, the `optimizer` attribute of the nested `after_scheduler` is restored from the state dict as well, so it still references the optimizer from the original run. This causes a static learning rate after resuming, because the `after_scheduler` updates the learning rate of an optimizer that is not the one used by the resumed training. Setting `self.after_scheduler.optimizer = self.optimizer` at the end of the `load_state_dict()` method should probably suffice to fix this.
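A minimal standalone sketch of the proposed fix (no PyTorch dependency; `_Scheduler` and `WarmupWrapper` are hypothetical stand-ins for `_LRScheduler` and `GradualWarmupScheduler`, assuming the wrapper's `state_dict()` serializes the nested scheduler object whole while excluding its own `optimizer`):

```python
class _Scheduler:
    """Minimal stand-in for a PyTorch _LRScheduler (illustration only)."""
    def __init__(self, optimizer):
        self.optimizer = optimizer

    def state_dict(self):
        # Like torch schedulers, exclude our own optimizer reference, but
        # any nested scheduler attribute is stored as a whole object.
        return {k: v for k, v in self.__dict__.items() if k != 'optimizer'}

    def load_state_dict(self, state_dict):
        self.__dict__.update(state_dict)


class WarmupWrapper(_Scheduler):
    """Hypothetical stand-in for GradualWarmupScheduler."""
    def __init__(self, optimizer, after_scheduler):
        super().__init__(optimizer)
        self.after_scheduler = after_scheduler

    def load_state_dict(self, state_dict):
        super().load_state_dict(state_dict)
        # Proposed fix: re-bind the restored nested scheduler to the live
        # optimizer, since the deserialized after_scheduler still points at
        # the optimizer from the original run.
        self.after_scheduler.optimizer = self.optimizer


# Simulate saving in the original run and resuming with a new optimizer.
old_opt, new_opt = object(), object()
saved = WarmupWrapper(old_opt, _Scheduler(old_opt)).state_dict()

resumed = WarmupWrapper(new_opt, _Scheduler(new_opt))
resumed.load_state_dict(saved)
assert resumed.after_scheduler.optimizer is new_opt  # stale reference fixed
```

Without the re-binding line in `load_state_dict()`, the final assertion fails: the nested scheduler would still hold `old_opt` and silently update the wrong optimizer's learning rate.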