Recent LR Scheduler change does not account for inference/evaluation

dashstander opened this issue 8 months ago · 0 comments

The function setup_model_and_optimizer is also used for evaluation and inference, as a hack to initialize DeepSpeed properly. However, a recent change, made so that the LR scheduler is properly updated after resuming training from a checkpoint, assumes that there will always be an lr_scheduler object. During evaluation and inference lr_scheduler is None, so right now attempting to run evaluate.py from NeoX main gives:

  File "/mnt/ssd-1/dashiell/gpt-neox/megatron/utils.py", line 448, in setup_for_inference_or_eval
    lr_scheduler.optimizer = model.optimizer
AttributeError: 'NoneType' object has no attribute 'optimizer'
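
One possible guard is to skip the assignment when no scheduler was created. This is a minimal sketch, not the actual NeoX fix: the helper name attach_optimizer and the SimpleNamespace stand-ins for the DeepSpeed engine and the scheduler are hypothetical, used only to make the example self-contained.

```python
from types import SimpleNamespace
from typing import Any, Optional


def attach_optimizer(lr_scheduler: Optional[Any], model: Any) -> Optional[Any]:
    """Point the scheduler at the engine's optimizer, if a scheduler exists.

    Hypothetical helper: during evaluation/inference no scheduler is built,
    so lr_scheduler is None and the assignment has to be skipped.
    """
    if lr_scheduler is not None:
        lr_scheduler.optimizer = model.optimizer
    return lr_scheduler


# Stand-in for the initialized engine (assumption: it exposes .optimizer).
model = SimpleNamespace(optimizer="engine-optimizer")

# Training path: a scheduler exists and gets re-pointed at the optimizer.
train_scheduler = attach_optimizer(SimpleNamespace(optimizer=None), model)

# Evaluation/inference path: no scheduler, so nothing is assigned and
# no AttributeError is raised.
eval_scheduler = attach_optimizer(None, model)
```

With a check like this, the training path keeps the behavior the recent change introduced, while the evaluation and inference path simply passes None through.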