Learning Rate for `gradientdescent`
Opened this issue · 1 comment
GraboDan commented
Hi Jon,
When I use the normal `gradientdescent` function and set `alpha` to zero (as the comment on that line implies this will lead to a variable learning rate), the learning rate actually stays at 0 the entire time, and my MNIST error stays at ~90%. If I instead set `alpha` to a nonzero value, it is used as the initial learning rate and then decays according to the equation in `alphatau`. So the decay itself works, but the comment is misleading: the only behavior available is a variable (decaying) learning rate, and a constant learning rate seems not to be implemented.
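For illustration, here is a minimal sketch of how both behaviors could be exposed from one parameter. This is hypothetical, not the repo's actual code: the names `learning_rate`, `alpha0`, and `tau` are mine, and the decay form `alpha0 / (1 + t / tau)` is an assumption, since the actual `alphatau` equation isn't shown in this issue.

```python
# Hypothetical sketch (not the repo's actual code): one way to support both
# a constant and a decaying learning rate from a single schedule function.
# The decay form alpha0 / (1 + t / tau) is an assumption; the repo's
# alphatau equation may differ.

def learning_rate(alpha0, t, tau=None):
    """Return the step size at iteration t.

    alpha0 : initial learning rate (must be > 0 to make any progress)
    tau    : decay time constant; None means keep alpha0 constant
    """
    if tau is None:
        return alpha0                    # constant learning rate
    return alpha0 / (1.0 + t / tau)      # decaying learning rate

# Usage: compare a constant schedule against a decaying one.
for t in range(5):
    print(t, learning_rate(0.1, t), learning_rate(0.1, t, tau=2.0))
```

With a signature like this, `alpha0 = 0` would correctly produce no learning at all (matching what I observed), while the constant-vs-variable choice is made explicitly via `tau` instead of being inferred from a zero `alpha`.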
Anyways, great code :)