jaberg/pyautodiff

Lower-memory batch optimization (L-BFGS)


The batch cost is a sum over the data set, and so is its gradient. There is no need to compute over the whole data set at once; that wastes a huge amount of memory. What interface should we give for avoiding this? (A quick numeric check of the sum structure is sketched below.)
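A minimal sketch of the claim, using a toy least-squares cost (none of these names are part of pyautodiff): because cost and gradient are sums over examples, accumulating them block by block gives the same result as one full-batch evaluation.

```python
import numpy as np

rng = np.random.RandomState(0)
X, y, w = rng.randn(1000, 5), rng.randn(1000), rng.randn(5)

def cost_and_grad(w, X, y):
    # toy least-squares cost; both cost and gradient are sums over rows
    err = X.dot(w) - y
    return 0.5 * (err ** 2).sum(), X.T.dot(err)

full_c, full_g = cost_and_grad(w, X, y)

acc_c, acc_g = 0.0, np.zeros_like(w)
for i in range(0, len(X), 100):               # process 100-row blocks
    c, g = cost_and_grad(w, X[i:i + 100], y[i:i + 100])
    acc_c += c
    acc_g += g

assert np.allclose(full_c, acc_c) and np.allclose(full_g, acc_g)
```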

Idea: give lbfgs the same streams interface that fmin_sgd has. The algorithm iterates over all blocks, accumulating the gradient, between each call to L-BFGS.
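A rough sketch of what that could look like, assuming SciPy's `fmin_l_bfgs_b` as the L-BFGS driver; `cost_and_grad_block` and `data_blocks` are hypothetical placeholders for whatever the streams interface yields, not existing pyautodiff API.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def make_batch_fn(cost_and_grad_block, data_blocks):
    """Return f(params) -> (total_cost, total_grad), accumulated over blocks.

    Only one block is materialized at a time, so memory stays bounded
    while L-BFGS still sees the full-batch cost and gradient.
    """
    def f(params):
        total_cost = 0.0
        total_grad = np.zeros_like(params)
        for block in data_blocks:
            c, g = cost_and_grad_block(params, block)  # per-block cost/grad
            total_cost += c
            total_grad += g
        return total_cost, total_grad
    return f

# Usage sketch:
# f = make_batch_fn(cost_and_grad_block, data_blocks)
# params_opt, final_cost, info = fmin_l_bfgs_b(f, params0)
```

Between the lines of this sketch, L-BFGS only ever calls the accumulated function, so the block iteration happens inside each cost/gradient evaluation rather than inside the optimizer itself.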