SVRG update at full gradient too late?
Opened this issue · 3 comments
@epapoutsellis and i saw differences between this SVRG implementation and a test-one from myself. Checking the code
StochasticCIL/Wrappers/Python/cil/optimisation/functions/SVRGFunction.py
Lines 81 to 114 in f9f67d5
I believe that at the "update full gradient" stage, the full gradient is not is actually not used to update the image.
self.stochastic_grad_difference
is not changed in this case, and that as that is used as return value, it means that the return value will actually be full_gradient + previous_grad_diff * N`, which is incorrect.
(I think the fact that at the first iteration the current code works is because self.stochastic_grad_difference
is initialised as zero).
Instead, I think the update_flag=True
case needs to have
if not out is None:
out.fill(self.full_gradient_at_snapshot)
return out
return self.full_gradient_at_snapshot
pinging @jakobsj @zeljkozeljko @MargaretDuff for confirmation
@KrisThielemans - do you want to try https://github.com/MargaretDuff/CIL-margaret/tree/svrg. This is the latest branch and the one we will (hopefully) be merging into CIL
That one doesn't have the above logic problem as far as I can see. I have a few minor comments on that function. Where/how do you want me to comment on it?
Still do not understand the problem and if it is a bug but I trust @KrisThielemans . Need to test if it passes cvxpy tests,
https://github.com/epapoutsellis/StochasticCIL/blob/svrg/Wrappers/Python/test/test_SVRGFunction.py
https://github.com/epapoutsellis/StochasticCIL/blob/svrg/Wrappers/Python/test/test_LSVRGFunction.py