About the learning rate
JingyangXiang opened this issue · 2 comments
Hello, while reproducing the paper I noticed that the maximum learning rate in pvpvt_s.txt is 0.00125:
{"train_lr": 1.000000000000015e-06, "train_loss": 6.913535571038723, "test_loss": 6.8714314655021385, "test_acc1": 0.2160000117301941, "test_acc5": 1.2940000837326049, "epoch": 0, "n_parameters": 24106216}
{"train_lr": 1.000000000000015e-06, "train_loss": 6.896081995010376, "test_loss": 6.839652865021317, "test_acc1": 0.40800002346038816, "test_acc5": 1.7700001041412354, "epoch": 1, "n_parameters": 24106216}
{"train_lr": 0.0002507999999999969, "train_loss": 6.628805226147175, "test_loss": 5.555466687237775, "test_acc1": 6.462000321006775, "test_acc5": 18.384000979614257, "epoch": 2, "n_parameters": 24106216}
{"train_lr": 0.0005006000000000066, "train_loss": 6.272622795701027, "test_loss": 4.64223074471509, "test_acc1": 15.328000774383545, "test_acc5": 34.724001724243166, "epoch": 3, "n_parameters": 24106216}
{"train_lr": 0.0007504000000000098, "train_loss": 5.958464104115963, "test_loss": 3.9152780064830073, "test_acc1": 24.868001231384277, "test_acc5": 48.68200244750977, "epoch": 4, "n_parameters": 24106216}
{"train_lr": 0.0010002000000000064, "train_loss": 5.670980889737606, "test_loss": 3.3432016747969167, "test_acc1": 33.47600182495117, "test_acc5": 59.10200291748047, "epoch": 5, "n_parameters": 24106216}
{"train_lr": 0.0012491503115478462, "train_loss": 5.421633140593767, "test_loss": 2.9919017029029353, "test_acc1": 39.19200199279785, "test_acc5": 65.3920035522461, "epoch": 6, "n_parameters": 24106216}
{"train_lr": 0.0012487765716255204, "train_loss": 5.202176849722862, "test_loss": 2.6157535275927297, "test_acc1": 45.32800238342285, "test_acc5": 71.13600388793945, "epoch": 7, "n_parameters": 24106216}
Was this learning rate obtained by scaling with the batch size, or was it found by tuning?
Thank you very much.
The lr is scaled in proportion to the batch size; you can take a look at the code.
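Thank you very much for your reply.
My earlier confusion was this: if the learning rate comes from scaling, then with 8 GPUs the per-GPU batch size would be 160, which seems like a fairly uncommon configuration. That is why I opened this issue.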
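For reference, the arithmetic behind this can be sketched as below. It assumes a DeiT-style linear scaling rule with a base learning rate of 5e-4 and a reference batch size of 512. Neither value is stated in this thread, and the function name `scaled_lr` is hypothetical, so please check the repository's training script for the actual settings.

```python
def scaled_lr(base_lr, batch_size_per_gpu, num_gpus, reference_batch_size=512):
    """Linear scaling rule: lr grows in proportion to the total batch size.

    base_lr and reference_batch_size are assumptions (DeiT-style defaults),
    not values confirmed in this thread.
    """
    total_batch_size = batch_size_per_gpu * num_gpus
    return base_lr * total_batch_size / reference_batch_size


# 8 GPUs x 160 samples per GPU = 1280 samples per step.
peak_lr = scaled_lr(base_lr=5e-4, batch_size_per_gpu=160, num_gpus=8)
print(f"peak lr = {peak_lr:.5f}")
```

Under these assumptions, a total batch size of 1280 gives a peak learning rate of about 0.00125, which is consistent with the maximum `train_lr` in the log above.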