JOSS review: zeroth-order optimization
In the README and documentation, the package is motivated as being useful for stochastic optimization. I think it is also worth mentioning that your package is meant for zeroth-order (gradient-free) optimization, to contrast it with methods like stochastic gradient descent (a first-order method). I would suggest adding some text to the README and documentation emphasizing that this package is especially useful for cases where the gradient of the objective is hard or impossible to compute.
I agree that this needs clarification. As a first step, I've updated the README and documentation to mention that neither algorithm needs explicit derivatives.
If I get around to it, I will try to add a more detailed discussion to the documentation at some later time. Pattern search is a genuinely gradient-free method. Simultaneous perturbation stochastic approximation (SPSA) does not require the gradient explicitly, but it is essentially a gradient method built on a finite-difference approximation. Stochastic gradient descent, by contrast, exploits the special problem structure (the noise comes from the choice of training examples) to evaluate the gradient explicitly, yet it is also a kind of stochastic approximation algorithm.
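To make that distinction concrete, here is a minimal sketch of the SPSA gradient estimate, written for illustration only (the function names and step-size choices below are hypothetical and not taken from the package). The point is that SPSA recovers a gradient estimate from just two noisy function evaluations, whatever the dimension, by perturbing all coordinates at once with a random ±1 direction:

```python
import numpy as np

def spsa_gradient_estimate(f, x, c=0.1, rng=np.random.default_rng()):
    """Estimate the gradient of f at x from two noisy evaluations.

    SPSA perturbs every coordinate simultaneously with a random
    Rademacher (+/-1) direction, so the cost stays at two function
    evaluations regardless of the dimension of x.
    """
    delta = rng.choice([-1.0, 1.0], size=x.shape)   # simultaneous perturbation
    y_plus = f(x + c * delta)
    y_minus = f(x - c * delta)
    # Central finite difference along the random direction,
    # divided elementwise by the perturbation.
    return (y_plus - y_minus) / (2.0 * c * delta)

# Hypothetical usage: a few plain SPSA steps on a noisy quadratic.
def noisy_quadratic(x):
    return np.sum(x ** 2) + 0.01 * np.random.default_rng().normal()

x = np.ones(5)
for k in range(100):
    a_k = 0.1 / (k + 1)                              # decaying step size
    g_hat = spsa_gradient_estimate(noisy_quadratic, x, c=0.1)
    x = x - a_k * g_hat                              # descend along the estimate

print(x)
```

So SPSA still moves along an (approximate) gradient direction; it just never asks the user for derivatives, which is why it counts as zeroth-order.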
Ah, that makes sense. Thanks for clarifying!