Cosine Annealing:
torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1)
Source here
Takes ideas from here
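As a rough illustration of what this scheduler computes, the sketch below evaluates the closed-form cosine schedule in plain Python, without PyTorch. The helper name `cosine_annealing_lr` and the values `base_lr=0.1` and `t_max=10` are illustrative assumptions, not part of the library; the formula is the single-cycle (no restart) cosine decay from the initial learning rate down to `eta_min`.

```python
import math

def cosine_annealing_lr(step, base_lr, t_max, eta_min=0.0):
    """Learning rate at `step` for one cosine decay from base_lr to eta_min.

    Hypothetical helper for illustration; not a PyTorch function.
    """
    cosine = math.cos(math.pi * step / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)

# Assumed example values: start at 0.1 and anneal over 10 steps.
schedule = [cosine_annealing_lr(t, base_lr=0.1, t_max=10) for t in range(11)]

for t, lr in enumerate(schedule):
    print(f"step {t:2d}: lr = {lr:.5f}")
```

The rate starts at `base_lr`, passes through the midpoint between `base_lr` and `eta_min` halfway through, and reaches `eta_min` at `step == t_max`. In a training loop the scheduler object itself would normally be advanced by calling `scheduler.step()` after each epoch.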
Limited-memory BFGS:
torch.optim.LBFGS(params, lr=1, max_iter=20, max_eval=None, tolerance_grad=1e-05, ...)
Source here
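Unlike most optimizers, LBFGS may re-evaluate the objective several times per step, so `step()` expects a closure that clears the gradients, computes the loss, and calls `backward()`. The sketch below shows that pattern on a toy least-squares problem; the data, the variable names, and the choice of `line_search_fn="strong_wolfe"` are assumptions made for the example (the line-search option requires a reasonably recent PyTorch release).

```python
import torch

torch.manual_seed(0)

# Toy problem (assumed for illustration): recover w_true from noiseless data.
X = torch.randn(64, 3)
w_true = torch.tensor([1.5, -2.0, 0.5])
y = X @ w_true

w = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.LBFGS(
    [w], lr=1, max_iter=20, line_search_fn="strong_wolfe"
)

def closure():
    # LBFGS calls this repeatedly within a single step().
    optimizer.zero_grad()
    loss = ((X @ w - y) ** 2).mean()
    loss.backward()
    return loss

for _ in range(5):
    optimizer.step(closure)

final_loss = closure().item()
print("final loss:", final_loss)
print("estimated w:", w.detach())
```

Each call to `step(closure)` runs up to `max_iter` inner iterations, so a handful of outer steps is usually enough on a small, well-conditioned problem like this one.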
Stochastic Weight Averaging:
Source here
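Stochastic Weight Averaging keeps an equal-weight running average of the model weights visited late in training and uses that average as the final model. A minimal sketch of the averaging update in plain Python follows; the helper `update_swa` and the `snapshots` values are hypothetical and stand in for flattened weight vectors saved at the end of successive epochs.

```python
def update_swa(swa_weights, weights, n_averaged):
    """Fold one more weight snapshot into the running equal-weight average.

    Hypothetical helper: n_averaged is the number of snapshots already
    included in swa_weights.
    """
    return [s + (w - s) / (n_averaged + 1) for s, w in zip(swa_weights, weights)]

# Made-up weight snapshots, one per epoch, purely for illustration.
snapshots = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]

swa = list(snapshots[0])
for n, snap in enumerate(snapshots[1:], start=1):
    swa = update_swa(swa, snap, n)

print("averaged weights:", swa)
```

The result equals the plain mean of the snapshots, computed incrementally so that only one extra copy of the weights has to be stored. Recent PyTorch releases provide this bookkeeping through `torch.optim.swa_utils.AveragedModel`; models with batch normalization also need their running statistics recomputed for the averaged weights.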