Report a bug that causes instability in training
pengzhangzhi opened this issue · 1 comments
Hi, I would like to report a bug in the handling of rotations that causes instability during training.

The IPA Transformer is similar to the structure module in AF2, which iterates (recycles) the same block. Note that in AF2 the gradient of the rotations is detached between iterations; skipping this detach can cause instability during training. The reason is that the gradient flowing through the rotations would update them during backpropagation, which, based on experiments, results in instability. Therefore the rotations are usually detached to remove this updating effect of gradient descent. I have seen you do this in your alphafold2 repo (https://github.com/lucidrains/alphafold2).
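To illustrate the point, here is a minimal sketch of an iterative structure-module-style loop with the detach applied between iterations. The names (`iterate_structure`, `update_fn`) and tensor shapes are hypothetical, not taken from this repo; the only essential part is the `.detach()` on the rotations so gradients do not flow through the rotation frames across iterations:

```python
import torch

def iterate_structure(rotations, translations, num_iters, update_fn,
                      detach_rotations=True):
    # rotations: (batch, n, 3, 3), translations: (batch, n, 3)
    # update_fn stands in for one pass of the IPA block, returning
    # a rotation update and a translation update per residue.
    for _ in range(num_iters):
        rot_update, trans_update = update_fn(rotations, translations)
        rotations = rotations @ rot_update
        translations = translations + trans_update
        if detach_rotations:
            # stop gradients from flowing through the rotation frames
            # between iterations, as in the AF2 structure module
            rotations = rotations.detach()
    return rotations, translations
```

With `detach_rotations=True`, the loss gradient still reaches the weights inside `update_fn` at every iteration, but it no longer backpropagates through the accumulated rotation frames themselves.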
If you think this is a problem, please let me know. I am happy to submit a pr to fix that.
Best,
Zhangzhi Peng
Cool! I've added the ability to detach rotations in 70e685c