modern-fortran/neural-fortran

Implement Adam optimizer

milancurcic opened this issue · 6 comments

Proposed by @rweed in #114.

Paper: https://arxiv.org/abs/1412.6980
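
For reference, the update rule from the paper (Algorithm 1), with gradient $g_t$, step size $\alpha$, decay rates $\beta_1, \beta_2$, and a small $\epsilon$ for numerical stability:

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat m_t &= m_t / (1 - \beta_1^t), \qquad \hat v_t = v_t / (1 - \beta_2^t) \\
\theta_t &= \theta_{t-1} - \alpha\, \hat m_t / (\sqrt{\hat v_t} + \epsilon)
\end{aligned}
$$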

Currently, the optimizers module is only a stub, and the only available optimizer (SGD) is hardcoded in the network % train method, with the weight updates propagating all the way down to the individual concrete layer implementations. Some refactoring is needed to decouple the weight updates from the concrete layer implementations and to allow defining optimizer algorithms in their own concrete types.
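
A minimal sketch of what the decoupling could look like, assuming an abstract optimizer type with a deferred update procedure (all names here are hypothetical, not the actual neural-fortran API):

```fortran
module nf_optimizers_sketch
  ! Hypothetical sketch: concrete algorithms (SGD, Adam, ...) extend an
  ! abstract base, so layers only expose parameters and gradients and
  ! never perform the weight update themselves.
  implicit none

  private
  public :: optimizer_base, sgd

  type, abstract :: optimizer_base
    real :: learning_rate = 0.01
  contains
    procedure(minimize_interface), deferred :: minimize
  end type optimizer_base

  abstract interface
    subroutine minimize_interface(self, params, gradients)
      import :: optimizer_base
      class(optimizer_base), intent(inout) :: self
      real, intent(inout) :: params(:)
      real, intent(in) :: gradients(:)
    end subroutine minimize_interface
  end interface

  type, extends(optimizer_base) :: sgd
  contains
    procedure :: minimize => sgd_minimize
  end type sgd

contains

  subroutine sgd_minimize(self, params, gradients)
    ! Plain gradient descent step; a stateful optimizer like Adam would
    ! carry its moment estimates as components of its own type.
    class(sgd), intent(inout) :: self
    real, intent(inout) :: params(:)
    real, intent(in) :: gradients(:)
    params = params - self % learning_rate * gradients
  end subroutine sgd_minimize

end module nf_optimizers_sketch
```

The network would then pass each layer's flattened parameters and gradients to optimizer % minimize, instead of every layer implementing its own update.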

rweed commented

Milan

FYI, I found one Fortran implementation of Adam at
https://github.com/thchang/NN_MOD
Unfortunately, the comments in the code suggest it was implemented but never tested. I'm still looking for a batch normalization implementation (outside of Keras).

Also, one of my other "wants", linear layers, is trivial to implement; it took me all of about 5 minutes to do in your existing code.

milancurcic commented

Thanks for the link to NN_MOD. I'd like to work on Adam first. I think it's easier to implement than batch norm, and it will drive the much-needed refactor for optimizers in general (rather than them being hardcoded in the network % train subroutine).
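
If the refactor goes in a direction like the sketch above, Adam mostly adds per-parameter state (first and second moment estimates plus a step counter for bias correction) on top of the abstract base. A rough, hypothetical continuation of that sketch:

```fortran
! Would live alongside the optimizer_base sketch above (names hypothetical).

type, extends(optimizer_base) :: adam
  real :: beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8
  real, allocatable :: m(:), v(:)  ! first and second moment estimates
  integer :: t = 0                 ! step counter for bias correction
contains
  procedure :: minimize => adam_minimize
end type adam

subroutine adam_minimize(self, params, gradients)
  class(adam), intent(inout) :: self
  real, intent(inout) :: params(:)
  real, intent(in) :: gradients(:)

  ! Allocate the moment estimates on first use.
  if (.not. allocated(self % m)) then
    allocate(self % m(size(params)), self % v(size(params)))
    self % m = 0
    self % v = 0
  end if

  self % t = self % t + 1
  self % m = self % beta1 * self % m + (1 - self % beta1) * gradients
  self % v = self % beta2 * self % v + (1 - self % beta2) * gradients**2

  ! Bias-corrected update (Algorithm 1 in the paper).
  params = params - self % learning_rate &
    * (self % m / (1 - self % beta1 ** self % t)) &
    / (sqrt(self % v / (1 - self % beta2 ** self % t)) + self % epsilon)
end subroutine adam_minimize
```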

Would you like to contribute the linear layer here as a PR? As I understand it, it's just a dense layer but without an activation. Are you just using a dense layer but with a "no-op" activation function (i.e. y = x)?
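
If it's the no-op activation route, the activation pair is just the identity and a constant derivative of one, e.g. (a sketch; the names and interface are not the repo's actual activation API):

```fortran
module nf_linear_activation_sketch
  implicit none
contains

  elemental function linear(x) result(res)
    ! Identity activation: y = x
    real, intent(in) :: x
    real :: res
    res = x
  end function linear

  elemental function linear_prime(x) result(res)
    ! Derivative of the identity is 1 everywhere
    real, intent(in) :: x
    real :: res
    res = 1
  end function linear_prime

end module nf_linear_activation_sketch
```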

milancurcic commented

@Spnetic-5, would you like to tackle this one next? I forgot whether you have a WIP implementation of Adam or AdaGrad?

Spnetic-5 commented

Yes, the Adam optimizer implementation is in progress; I'll make a PR soon.

Done by #150.