JUGGHM/PENet_ICRA2021

Question about lr

rightchose opened this issue · 5 comments

def adjust_learning_rate(lr_init, optimizer, epoch, args):

In training stage three, your paper says: "Finally, we train the full model with an initial learning rate of 0.02 and 0.002, respectively, for the weights in the backbone and DA-CSPN++." But on every iteration, doesn't the call to adjust_learning_rate set the same lr for all params (backbone and DA-CSPN++)?

The optimizer in stage 3 with different learning rate corresponding to different parameters is defined in main.py:

    elif (args.network_model == 'pe'):
        model_bone_params = [
            p for _, p in model.backbone.named_parameters() if p.requires_grad
        ]
        model_new_params = [
            p for _, p in model.named_parameters() if p.requires_grad
        ]
        model_new_params = list(set(model_new_params) - set(model_bone_params))
        optimizer = torch.optim.Adam([{'params': model_bone_params, 'lr': args.lr / 10}, {'params': model_new_params}],
                                     lr=args.lr, weight_decay=args.weight_decay, betas=(0.9, 0.99))


I know that. But the iterate function does the following:

    if mode == 'train':
        model.train()
        lr = helper.adjust_learning_rate(args.lr, optimizer, actual_epoch, args)

When the code reaches this point, it calls adjust_learning_rate. And this function:

def adjust_learning_rate(lr_init, optimizer, epoch, args):
    """Sets the learning rate to the initial LR decayed by 10 every 5 epochs"""
    #lr = lr_init * (0.5**(epoch // 5))
    #'''
    lr = lr_init
    if (args.network_model == 'pe' and args.freeze_backbone == False):
        if (epoch >= 10):
            lr = lr_init * 0.5
        if (epoch >= 20):
            lr = lr_init * 0.1
        if (epoch >= 30):
            lr = lr_init * 0.01
        if (epoch >= 40):
            lr = lr_init * 0.0005
        if (epoch >= 50):
            lr = lr_init * 0.00001
    else:
        if (epoch >= 10):
            lr = lr_init * 0.5
        if (epoch >= 15):
            lr = lr_init * 0.1
        if (epoch >= 25):
            lr = lr_init * 0.01
    #'''

    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
    return lr

sets every parameter group to the same lr computed from lr_init, so the backbone's lower learning rate is overwritten.

The optimizer has two parameter groups with different learning rates, as defined in main.py. But in iterate, adjust_learning_rate updates both groups with the same learning rate.
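
For illustration, here is a minimal, self-contained sketch (not from the repository; the concrete values are made up) showing how assigning one value to every param_group erases the per-group learning rates:

    import torch

    # Two dummy parameter groups standing in for the backbone and DA-CSPN++ weights.
    backbone_params = [torch.nn.Parameter(torch.randn(2, 2))]
    new_params = [torch.nn.Parameter(torch.randn(2, 2))]

    # Same structure as the stage-3 optimizer in main.py: backbone at lr/10, the rest at lr.
    lr = 0.02
    optimizer = torch.optim.Adam(
        [{'params': backbone_params, 'lr': lr / 10}, {'params': new_params}],
        lr=lr, betas=(0.9, 0.99))

    print([g['lr'] for g in optimizer.param_groups])  # [0.002, 0.02]

    # What adjust_learning_rate effectively does: one value for every group.
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr  # the 1/10 ratio for the backbone is lost here

    print([g['lr'] for g in optimizer.param_groups])  # [0.02, 0.02]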

I think you're right: the parameters are actually updated with the same learning rate, so this is indeed a mistake. The design of different learning rates comes from a common practice in some semantic segmentation networks, where the parameters of the pretrained backbone are updated with 1/10 of the learning rate. I don't know whether keeping that ratio throughout training would help here; maybe you could try it.
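
One possible fix (a sketch, not the authors' code) is to decay each param group relative to its own initial learning rate instead of overwriting every group with the same value. It assumes an extra 'initial_lr' key is added to each group dict when the optimizer is built in main.py; the decay milestones mirror the ones in the existing adjust_learning_rate:

    def adjust_learning_rate(lr_init, optimizer, epoch, args):
        """Decay each param group relative to its own initial lr,
        preserving the backbone / DA-CSPN++ ratio."""
        if args.network_model == 'pe' and not args.freeze_backbone:
            # Same milestones as the unfrozen 'pe' branch in helper.py.
            schedule = [(50, 0.00001), (40, 0.0005), (30, 0.01), (20, 0.1), (10, 0.5)]
        else:
            schedule = [(25, 0.01), (15, 0.1), (10, 0.5)]

        scale = 1.0
        for start_epoch, factor in schedule:
            if epoch >= start_epoch:
                scale = factor
                break  # milestones are sorted high-to-low, first match wins

        lr = lr_init * scale  # base lr, returned for logging as before
        for param_group in optimizer.param_groups:
            # 'initial_lr' is an assumed extra key set when building the optimizer, e.g.
            # {'params': model_bone_params, 'lr': args.lr / 10, 'initial_lr': args.lr / 10}
            param_group['lr'] = param_group.get('initial_lr', lr_init) * scale
        return lr

With this version the backbone group would stay at 1/10 of the other group's learning rate across all decay steps, while the caller in iterate would not need to change.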