Training from saved checkpoint
nlpmc commented
When I try to train from a saved checkpoint, model_builder.py#L24 raises an exception because the dict optim doesn't have an attribute optimizer. Since trainer_builder.py#L256 saves the optimizer's state dict in the checkpoint file, the loaded optim is a dict, not an Optimizer object.
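For context, the saving side stores only the inner torch optimizer's state dict, so after torch.load() the checkpoint's 'optim' entry comes back as a plain dict. A simplified sketch of what the saving side looks like; the surrounding keys and variable names here are illustrative, not the repo's exact code:

import torch

# Only the wrapped torch optimizer's state dict is serialized, so
# checkpoint['optim'] is a dict rather than an Optimizer object when reloaded.
checkpoint = {
    'model': model.state_dict(),
    'optim': optim.optimizer.state_dict(),
}
torch.save(checkpoint, checkpoint_path)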
One possible fix is to change how optim is loaded at the beginning of model_builder.py:
def build_optim(args, model, checkpoint):
    """ Build optimizer, keeping the saved optimizer state dict if resuming. """
    saved_optimizer_state_dict = None
    optim = Optimizer(
        args.optim, args.lr, args.max_grad_norm,
        beta1=args.beta1, beta2=args.beta2,
        decay_method=args.decay_method,
        warmup_steps=args.warmup_steps, model_size=args.enc_hidden_size)
    if args.train_from != '':
        # checkpoint['optim'] is a state dict, so store it separately instead of
        # treating it as an Optimizer object.
        saved_optimizer_state_dict = checkpoint['optim']
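The rebuilt Optimizer then needs its inner torch optimizer re-initialized from the saved state dict. A minimal sketch of how build_optim could continue, assuming an OpenNMT-style Optimizer that creates its wrapped torch optimizer in set_parameters() and exposes it as optim.optimizer, plus an args.visible_gpus flag; these names are assumptions, not confirmed against the repo:

    optim.set_parameters(list(model.named_parameters()))

    if args.train_from != '':
        # Restore the inner torch optimizer's state from the checkpoint dict.
        optim.optimizer.load_state_dict(saved_optimizer_state_dict)
        # Move any restored state tensors back onto the GPU when training on GPU.
        if args.visible_gpus != '-1':
            for state in optim.optimizer.state.values():
                for k, v in state.items():
                    if torch.is_tensor(v):
                        state[k] = v.cuda()

    return optim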
nlpyang commented
Thanks, this has been fixed.