Could you add Nesterov momentum to SGD?
georgekasa opened this issue · 0 comments
georgekasa commented
Hello,
I saw that you are using SGD with momentum (default 0.9). Could you add an option to enable Nesterov momentum? For example, the optimizer call at lines 375-376 could become:
optimizer = torch.optim.SGD(params, lr=args.lr, momentum=args.momentum,
                            weight_decay=args.weight_decay,
                            nesterov=args.nesterov)
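For `args.nesterov` to exist, the script would need a matching command-line option. Here is a minimal sketch, assuming the script parses its options with argparse. The `--nesterov` flag name, the `build_parser` and `sgd_kwargs` helpers, and the placeholder defaults for `--lr` and `--weight-decay` are hypothetical and not taken from the repository.

```python
import argparse


def build_parser():
    # Hypothetical subset of the script's options; only the momentum
    # default (0.9) comes from this issue, the others are placeholders.
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=0.1)
    parser.add_argument("--momentum", type=float, default=0.9)
    parser.add_argument("--weight-decay", type=float, default=1e-4)
    # Off by default, so existing runs keep their current behaviour.
    parser.add_argument("--nesterov", action="store_true",
                        help="use Nesterov momentum in SGD")
    return parser


def sgd_kwargs(args):
    # torch.optim.SGD raises a ValueError for nesterov=True unless
    # momentum > 0 (and dampening == 0), so fail early with a clear message.
    if args.nesterov and args.momentum <= 0:
        raise ValueError("--nesterov requires --momentum > 0")
    return dict(lr=args.lr, momentum=args.momentum,
                weight_decay=args.weight_decay, nesterov=args.nesterov)


args = build_parser().parse_args(["--nesterov"])
kwargs = sgd_kwargs(args)
# In the training script this would then be passed on as:
# optimizer = torch.optim.SGD(params, **kwargs)
```

Using `action="store_true"` keeps the flag optional, so runs that do not pass `--nesterov` behave exactly as before.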
As Karpathy notes in CS231n:
Nesterov Momentum is a slightly different version of the momentum update that has recently been gaining popularity. It enjoys stronger theoretical convergence guarantees for convex functions and in practice it also consistently works slightly better than standard momentum.
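To make the "slightly different" update concrete, here is a small pure-Python sketch of the two update rules in the form given in the CS231n notes, applied to the toy function f(x) = 0.5 * x**2. The function names, starting point, learning rate, and step count are my own choices for illustration, and PyTorch's internal formulation may be written differently.

```python
def momentum_step(x, v, grad, lr, mu):
    # Standard momentum: update the velocity, then move by it.
    v = mu * v - lr * grad
    return x + v, v


def nesterov_step(x, v, grad, lr, mu):
    # Nesterov momentum (CS231n form): the velocity update is the same,
    # but the position update mixes the old and new velocity.
    v_prev = v
    v = mu * v - lr * grad
    return x - mu * v_prev + (1 + mu) * v, v


def minimize(step, x0=5.0, lr=0.1, mu=0.9, steps=500):
    # Toy objective f(x) = 0.5 * x**2, whose gradient is simply x.
    x, v = x0, 0.0
    for _ in range(steps):
        grad = x
        x, v = step(x, v, grad, lr, mu)
    return x


x_momentum = minimize(momentum_step)
x_nesterov = minimize(nesterov_step)
print(x_momentum, x_nesterov)  # both end up very close to the minimum at 0
```

This only illustrates the mechanics of the two rules on a convex toy problem. It says nothing about how much the option would help for this repository's models.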
Thank you in advance.