lucidrains/invariant-point-attention

Report a bug that causes instability in training

pengzhangzhi opened this issue · 1 comment

Hi, I would like to report a bug in the rotation handling that causes instability in training.

rotations = quaternion_to_matrix(quaternions)

The IPA Transformer is similar to the structure module in AF2, where recycling is used. Note that the gradient of the rotations is usually detached; leaving it attached can cause instability during training. The reason is that gradients flowing through the rotations would update them directly during backpropagation, which, in my experiments, results in instability. Therefore, the rotations are usually detached to remove this updating effect of gradient descent (see the sketch below). I have seen you do this in your alphafold2 repo (https://github.com/lucidrains/alphafold2).
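
For clarity, here is a minimal sketch of what I mean, using hypothetical stand-in modules (a plain `Linear` in place of the real IPA block, and an assumed `to_update` head); only the `.detach()` call is the point being illustrated:

```python
import torch
import torch.nn.functional as F
from torch import nn
from pytorch3d.transforms import quaternion_multiply, quaternion_to_matrix

class ToyStructureLoop(nn.Module):
    def __init__(self, dim, depth = 4, detach_rotations = True):
        super().__init__()
        self.detach_rotations = detach_rotations
        # stand-ins for the IPA blocks; real code would use invariant point attention
        self.blocks = nn.ModuleList([nn.Linear(dim, dim) for _ in range(depth)])
        # predicts 3 quaternion imaginary components + 3 translation components
        self.to_update = nn.Linear(dim, 6)

    def forward(self, single_repr):
        b, n, _ = single_repr.shape
        # start from identity frames
        quaternions = torch.tensor([1., 0., 0., 0.], device = single_repr.device).repeat(b, n, 1)
        translations = torch.zeros(b, n, 3, device = single_repr.device)

        for block in self.blocks:
            rotations = quaternion_to_matrix(quaternions)

            if self.detach_rotations:
                # the fix: stop gradients from flowing into the current frames,
                # so backprop only trains the per-iteration update
                rotations = rotations.detach()

            # real code would pass rotations / translations into the IPA block
            single_repr = block(single_repr)

            quaternion_update, translation_update = self.to_update(single_repr).split([3, 3], dim = -1)
            quaternion_update = F.pad(quaternion_update, (1, 0), value = 1.)  # real-part-first convention

            quaternions = quaternion_multiply(quaternions, quaternion_update)
            translations = translations + torch.einsum('b n c, b n c r -> b n r', translation_update, rotations)

        return single_repr, quaternions, translations
```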

If you think this is a problem, please let me know. I am happy to submit a PR to fix it.

Best,
Zhangzhi Peng

Cool! I've added the ability to detach rotations (commit 70e685c)
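
A hypothetical usage sketch of the new option; the `detach_rotations` flag name and the constructor arguments shown here are assumptions, not copied from the README:

```python
import torch
from invariant_point_attention import IPATransformer

model = IPATransformer(
    dim = 64,
    depth = 2,
    detach_rotations = True  # assumed flag added by the fix above
)
```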