RussTedrake/manipulation

Fix Exercise 11.1

Opened this issue · 0 comments

The stochastic approximation has

x <- x - eta * [ l(x + w) - l(x) ] w

but the unbiased estimate of the true gradient (for Gaussian w with variance sigma^2) should be

x <- x - eta * [ l(x + w) - l(x) ] w / sigma^2

so we seem to be off by a factor of sigma^2.
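A quick Monte Carlo sanity check of the scaling (a sketch, assuming w ~ N(0, sigma^2) and using a made-up quadratic objective, not the exercise's cost function):

```python
import numpy as np

def l(x):
    # toy quadratic objective with known gradient l'(x) = 2x
    return x ** 2

rng = np.random.default_rng(0)
x = 1.0
sigma = 0.1
n = 200_000

w = rng.normal(0.0, sigma, size=n)
# with the 1/sigma^2 factor, E[(l(x+w) - l(x)) * w] / sigma^2 = l'(x)
grad_est = np.mean((l(x + w) - l(x)) * w) / sigma ** 2
# without it, the estimate is shrunk by sigma^2
unscaled = np.mean((l(x + w) - l(x)) * w)

print(grad_est)  # close to the true gradient 2.0
print(unscaled)  # close to sigma^2 * 2.0 = 0.02
```

With a small sigma the unscaled update is ~10^4 times too small here, so the missing factor isn't cosmetic; it effectively changes the step size.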

This exercise is also a bit misleading, since it gives the impression that we escaped the local minimum because we used a zeroth-order method, when in fact we could have achieved the same effect with a first-order method that injects stochasticity.
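To illustrate the point, here is a sketch (with an invented double-well objective, not the exercise's cost) where plain gradient descent stalls in a local minimum, but first-order descent with noise injected into the gradient query escapes, because in expectation it descends the Gaussian-smoothed objective, whose spurious minimum has vanished:

```python
import numpy as np

# hypothetical double-well: local min near x ~ 0.93, global min near x ~ -1.06
def grad_l(x):
    return 4 * x ** 3 - 4 * x + 0.5

eta, sigma, steps = 0.05, 0.6, 300
rng = np.random.default_rng(0)

# plain first-order descent from the right basin: converges to the local min
x_plain = 1.2
for _ in range(steps):
    x_plain -= eta * grad_l(x_plain)

# first-order descent querying the gradient at perturbed points, averaged
# over a small batch; E[grad_l(x + w)] is the gradient of the smoothed
# objective E[l(x + w)], which for this sigma has a single basin
x_noisy = 1.2
for _ in range(steps):
    w = rng.normal(0.0, sigma, size=200)
    x_noisy -= eta * np.mean(grad_l(x_noisy + w))

print(x_plain)  # stays in the local-min basin (x > 0)
print(x_noisy)  # crosses into the global-min basin (x < 0)
```

The escape comes entirely from the injected noise (smoothing), not from the order of the oracle, which is the distinction the exercise blurs.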

I'd love to reimplement this problem using the various gradient estimators we've studied over the past year.