dtheta update issue? Transforming dtheta to get theta?
mamunir commented
Updating dtheta produces parameter values greater than 1, sometimes even greater than 1000. Is that normal? I don't think so. Am I missing something?
I re-implemented the layer in Python.
I pass [the dV value, the U value, the dU value for every single coordinate, and the normalized x-y coordinates] to the backprop function.
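To make the setup concrete, here is a minimal NumPy sketch of the kind of dtheta accumulation I mean (the function and variable names are illustrative; it assumes an affine grid generator over normalized coordinates):

```python
import numpy as np

def backprop_dtheta(d_grid, xs, ys):
    """Accumulate the gradient w.r.t. the 2x3 affine theta.

    d_grid : (N, 2) gradient of the loss w.r.t. the sampled source
             coordinates (x_s, y_s), one row per output pixel -- this
             is what the bilinear sampler's backward pass produces
             from dV, U, and dU.
    xs, ys : (N,) normalized target coordinates in [-1, 1].
    """
    # Grid generator: [x_s, y_s]^T = theta @ [x_t, y_t, 1]^T, so
    # dtheta is the sum over pixels of d_grid[i] outer [x_t, y_t, 1].
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=1)  # (N, 3)
    # Note: this is a raw sum over every output pixel, so the entries
    # of dtheta can grow with N unless you average or clip them.
    return d_grid.T @ coords  # (2, 3)
```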
@daerduoCarey your help in this regard is appreciated.
daerduoCarey commented
Training such a layer requires some careful tuning to control the gradients. Without the details of your training setup, I don't think I can help much. A practical first step is to reduce your learning rate and check whether you still observe this gradient overflow. Alternatively, you can make the learning rate for the STN smaller than that of the rest of the backbone.
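To illustrate the second option, here is a minimal PyTorch-style sketch of per-group learning rates (the module names and the 100x ratio are assumptions for illustration; in Caffe you can get the same effect with a per-layer lr_mult):

```python
import torch
import torch.nn as nn

# Hypothetical two-part model: a backbone and an STN localization head
# that regresses the 6 entries of the 2x3 affine theta.
model = nn.ModuleDict({
    "backbone": nn.Linear(10, 10),
    "stn": nn.Linear(10, 6),
})

# Give the STN parameters a much smaller learning rate than the backbone;
# the 100x ratio here is illustrative, not prescriptive.
optimizer = torch.optim.SGD(
    [
        {"params": model["backbone"].parameters(), "lr": 1e-2},
        {"params": model["stn"].parameters(), "lr": 1e-4},
    ],
    momentum=0.9,
)
```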