JOSS review: how to choose deltatol and niter?

Question

Closed this issue 7 years ago · 2 comments

I was playing around with the stochastic quadratic function and noticed that if deltatol is smaller than the variance of the noise added to the objective, the algorithm never converges, since even at the optimum the function values fluctuate too much. In general we don't know the variance of our objective, so how should one go about choosing the deltatol parameter (or, analogously, the niter parameter for minimizeSPSA)? If it is too big, the algorithm exits prematurely; if it is too small, it never returns. I don't know whether you have tips for getting around this problem, but some discussion in the documentation about the importance of deltatol might be helpful.
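One way to get a feel for this trade-off is to measure the noise directly by evaluating the objective many times at a fixed point. The following is only a minimal sketch: `noisy_quadratic`, `estimate_noise_std`, and the noise level `sigma=0.1` are hypothetical stand-ins for illustration and are not part of the library.

```python
import random
import statistics

def noisy_quadratic(x, sigma=0.1):
    """Stochastic quadratic (illustrative): sum of squares plus Gaussian noise."""
    return sum(xi * xi for xi in x) + random.gauss(0.0, sigma)

def estimate_noise_std(func, x, nsamples=200):
    """Estimate the noise level by evaluating func repeatedly at one fixed point."""
    values = [func(x) for _ in range(nsamples)]
    return statistics.stdev(values)

random.seed(0)
noise_std = estimate_noise_std(noisy_quadratic, [0.0, 0.0])

# Change of the noise-free quadratic when moving a distance deltatol
# away from the optimum along one coordinate.
deltatol = 0.1
signal = deltatol ** 2

print(f"estimated noise std:           {noise_std:.3f}")
print(f"function change over deltatol: {signal:.3f}")
print("single evaluations can resolve this step:", signal > noise_std)
```

If the function change over a step of size deltatol comes out well below the estimated noise level, single evaluations cannot distinguish neighbouring points, which is the situation described above.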

Answer 1 · 2017-05-23T15:36:53.000Z

You are right: the algorithm does not converge if the function differences over distances of deltatol are smaller than the stochasticity. This problem is resolved if errorcontrol=True, but the optimization still becomes exceedingly expensive if the variation of the function at the target accuracy is much smaller than the stochasticity. I added a comment about this to the docstring, see 0eba0e0.
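To see why the cost grows so quickly, here is a back-of-the-envelope sketch. It assumes a plain z-test on the difference of two sample means, which is not necessarily the test the library uses; `samples_to_resolve` and the numbers are hypothetical and only meant to show the scaling.

```python
import math

def samples_to_resolve(delta_f, sigma, z=1.96):
    """Rough number of evaluations per point needed so that a true difference
    delta_f between two points exceeds z standard errors.

    With n samples per point and noise std sigma, the standard error of the
    difference of the two sample means is sigma * sqrt(2 / n).
    """
    return math.ceil(2.0 * (z * sigma / delta_f) ** 2)

sigma = 0.1  # assumed noise standard deviation
for deltatol in (1.0, 0.1, 0.01):
    delta_f = deltatol ** 2  # quadratic objective near its minimum
    n = samples_to_resolve(delta_f, sigma)
    print(f"deltatol={deltatol:<5} delta_f={delta_f:.0e} samples per point ~ {n}")
```

Under these assumptions the required number of evaluations scales like (sigma / delta_f)^2, so for a quadratic objective halving deltatol multiplies the cost per comparison by roughly sixteen.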

deltatol is the target pattern size, i.e. it determines how precisely the optimum is located. It should be chosen as large as the accuracy you need allows. I have added a description of the termination criteria to the docstring for more clarity, see 9599387.
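The role of deltatol as a target pattern size can be illustrated with a stripped-down, deterministic compass search. This is a sketch of the general technique, not the library's implementation; `compass_search` and its `deltainit` and `redfactor` arguments are names chosen here for illustration.

```python
def compass_search(func, x0, deltainit=1.0, deltatol=1e-3, redfactor=2.0):
    """Minimal deterministic compass (pattern) search.

    Polls +/- delta along each coordinate, shrinks delta when no poll point
    improves, and stops once delta has dropped below deltatol.
    """
    x = list(x0)
    fx = func(x)
    nfev = 1
    delta = deltainit
    while delta >= deltatol:
        improved = False
        for i in range(len(x)):
            for step in (delta, -delta):
                trial = x[:i] + [x[i] + step] + x[i + 1:]
                ftrial = func(trial)
                nfev += 1
                if ftrial < fx:
                    x, fx, improved = trial, ftrial, True
        if not improved:
            delta /= redfactor  # no better neighbour: refine the pattern
    return x, fx, nfev

def shifted_quadratic(x):
    """Noise-free test function with its minimum at (1, 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)

x_coarse, _, n_coarse = compass_search(shifted_quadratic, [3.2, -1.7], deltatol=0.5)
x_fine, _, n_fine = compass_search(shifted_quadratic, [3.2, -1.7], deltatol=1e-3)
print("coarse:", x_coarse, "evaluations:", n_coarse)
print("fine:  ", x_fine, "evaluations:", n_fine)
```

A larger deltatol stops earlier with a less precise optimum, while a smaller one costs more evaluations. With a noisy objective each of those comparisons additionally has to be resolved against the noise.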

For the SPSA algorithm there is some discussion of suitable parameter choices in the literature (see http://www.jhuapl.edu/SPSA/PDF-SPSA/Spall_Implementation_of_the_Simultaneous.PDF), which I have now referenced in the docstring, see d3ce9bc.
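For orientation, a bare-bones SPSA loop with the gain sequences commonly quoted from that reference might look as follows. This is a sketch written for this thread, not the code behind minimizeSPSA; `spsa_minimize`, the test function, and the default for the stability constant `A` (10% of niter) are assumptions made here.

```python
import random

def spsa_minimize(func, x0, niter=1000, a=1.0, c=1.0,
                  alpha=0.602, gamma=0.101, A=None):
    """Minimal SPSA sketch with gains a_k = a / (k + 1 + A)**alpha
    and c_k = c / (k + 1)**gamma."""
    x = list(x0)
    if A is None:
        A = 0.1 * niter  # assumed stability constant, about 10% of niter
    for k in range(niter):
        ak = a / (k + 1 + A) ** alpha
        ck = c / (k + 1) ** gamma
        # Random +/-1 perturbation applied to all coordinates simultaneously.
        delta = [random.choice((-1.0, 1.0)) for _ in x]
        xplus = [xi + ck * di for xi, di in zip(x, delta)]
        xminus = [xi - ck * di for xi, di in zip(x, delta)]
        diff = func(xplus) - func(xminus)
        # Two function evaluations give a gradient estimate for every coordinate.
        x = [xi - ak * diff / (2.0 * ck * di) for xi, di in zip(x, delta)]
    return x

def noisy_quadratic(x, sigma=0.1):
    """Stochastic quadratic (illustrative): sum of squares plus Gaussian noise."""
    return sum(xi * xi for xi in x) + random.gauss(0.0, sigma)

random.seed(1)
x_opt = spsa_minimize(noisy_quadratic, [2.0, -1.5], niter=1000)
print("SPSA estimate of the optimum:", x_opt)
```

Here niter fixes the budget up front, since the loop has no convergence test, so the accuracy reached depends on how fast the step size a_k decays relative to the noise.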

Answer 2 · 2017-05-23T17:40:48.000Z

Ah, interesting. Thanks for the references!