Optimize Gradient Descent
Closed this issue · 5 comments
ZChosed commented
Possible options to increase speed of convergence using gradient descent:
- fine-tune the learning rate upward (this is dangerous if gradients can grow very large)
- use something like Adagrad to get convergence guarantees on all data sets, even with extreme gradients (see the sketch after this list)
- switch to Newton's method when descent slows past some threshold (defining this will be data dependent and may be hard to find) to converge to a nearby local min very quickly, since Newton's method is quadratically convergent near a minimum
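A minimal NumPy sketch of the Adagrad idea from the second bullet, not code from this repo: the names `grad_fn`, `theta0`, and the stopping tolerance are assumptions made for illustration. The point is that the per-parameter step size shrinks as squared gradients accumulate, which damps occasional huge gradients.

```python
import numpy as np

def adagrad_descent(grad_fn, theta0, lr=0.1, eps=1e-8, max_iters=1000, tol=1e-6):
    """Adagrad-style descent: per-parameter step sizes shrink as squared
    gradients accumulate, so rare very large gradients don't blow up the step."""
    theta = np.asarray(theta0, dtype=float).copy()
    accum = np.zeros_like(theta)          # running sum of squared gradients
    for _ in range(max_iters):
        g = grad_fn(theta)
        accum += g * g
        step = lr * g / (np.sqrt(accum) + eps)
        theta -= step
        if np.linalg.norm(step) < tol:    # stop once updates become tiny
            break
    return theta

# Hypothetical usage: minimize f(x) = x . x, whose gradient is 2x
if __name__ == "__main__":
    print(adagrad_descent(lambda x: 2 * x, np.array([5.0, -3.0])))
```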
ethanuppal commented
@zephanrs also suggested trying SGD
sidharthmrao commented
> @zephanrs also suggested trying SGD
I did too 🥺 😞
sidharthmrao commented
Maybe try sampling something like 7 sets of 4 points per step when computing the loss, as in the sketch below
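A minimal sketch of that sampling scheme, assuming a NumPy array of data points and a batch-gradient function; `points`, `grad_fn`, and the batch counts are placeholders for illustration, not names from this repo. It averages the gradient over 7 random mini-batches of 4 points instead of the whole dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

def sampled_gradient(points, grad_fn, num_batches=7, batch_size=4):
    """Estimate the full-data gradient from a few small random batches
    (here 7 batches of 4 points each), rather than every point."""
    grads = []
    for _ in range(num_batches):
        idx = rng.choice(len(points), size=batch_size, replace=False)
        grads.append(grad_fn(points[idx]))   # gradient on one mini-batch
    return np.mean(grads, axis=0)            # average over sampled batches
```

Each descent step would then use `sampled_gradient` in place of the full-batch gradient, trading a bit of noise for much cheaper iterations.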
ethanuppal commented
> @zephanrs also suggested trying SGD
> I did too 🥺 😞
Well you don't matter
sidharthmrao commented
> @zephanrs also suggested trying SGD
> I did too 🥺 😞
> Well you don't matter