Algorithm Mismatch
jinmang2 opened this issue · 3 comments
Paper Implementation
In the implementation, we obtain the blocked Q, K, V tensors for each level with the code below.
h-transformer-1d/h_transformer_1d/h_transformer_1d.py
Lines 164 to 179 in 110cab0
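For readers without the permalink expanded, here is a minimal sketch of what that blocking step might look like. This is an illustration, not the repository's exact code; the helper name `blocked_qkv_per_level`, the block size, and the level count are all assumptions:

```python
import torch
from einops import rearrange, reduce

def blocked_qkv_per_level(q, k, v, block_size=16, num_levels=3):
    # q, k, v: (batch, seq_len, dim); seq_len is assumed divisible
    # by block_size * 2 ** num_levels
    levels = []
    for _ in range(num_levels + 1):
        # chunk the current resolution into blocks of `block_size` tokens
        levels.append(tuple(
            rearrange(t, 'b (n z) d -> b n z d', z=block_size)
            for t in (q, k, v)))
        # mean-pool adjacent token pairs to form the next, coarser level
        q, k, v = (reduce(t, 'b (n r) d -> b n d', 'mean', r=2)
                   for t in (q, k, v))
    return levels
```

Each level halves the sequence resolution, so a level-`l` block of `block_size` coarse tokens covers `block_size * 2 ** l` original positions.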
The final result of the matrix-matrix product from Equation 29 (or 69) is then returned by the for loop below.
h-transformer-1d/h_transformer_1d/h_transformer_1d.py
Lines 234 to 247 in 110cab0
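Again as a hedged illustration rather than the repository's actual loop, the per-level accumulation might look roughly like this, reusing the imports and the `levels` list from the sketch above; the `flip_pairs` helper and the deferred normalization are my own simplifications:

```python
def flip_pairs(t):
    # swap adjacent blocks within each pair: (0, 1, 2, 3, ...) -> (1, 0, 3, 2, ...),
    # pairing every block with its sibling's coarse representation
    t = rearrange(t, 'b (n r) z d -> b n r z d', r=2)
    return rearrange(t.flip(dims=(2,)), 'b n r z d -> b (n r) z d')

def hierarchical_matmul(levels):
    num = den = None
    for level, (q, k, v) in enumerate(levels):
        k_off, v_off = flip_pairs(k), flip_pairs(v)
        # unnormalized scores against the sibling block (a real implementation
        # would subtract a running max before exponentiating, for stability)
        sim = torch.einsum('b n i d, b n j d -> b n i j', q, k_off).exp()
        y = torch.einsum('b n i j, b n j d -> b n i d', sim, v_off)
        r = sim.sum(dim=-1, keepdim=True)  # row sums for the final normalization
        y = rearrange(y, 'b n z d -> b (n z) d')
        r = rearrange(r, 'b n z o -> b (n z) o')
        # broadcast coarse-level contributions back to full resolution
        y = torch.repeat_interleave(y, 2 ** level, dim=1)
        r = torch.repeat_interleave(r, 2 ** level, dim=1)
        num = y if num is None else num + y
        den = r if den is None else den + r
    # the level-0 diagonal (self-block) term would be accumulated here too
    return num / den.clamp(min=1e-8)
```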
What is the problem?
However, according to the current code, it is not possible to include the information from the level-0 white blocks in the figure below. (Equation 70 of the paper includes the corresponding attention matrix entries.)
I think you should also add the level-0 off-diagonal near-interaction term to match Equation 69!
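To make the suggestion concrete, here is one possible sketch of that missing level-0 near-field term, where each query block attends to its own block plus its left and right neighbor blocks (the block-tridiagonal structure of Equation 70). This is a hypothetical illustration of the proposed fix, not the patch that was actually merged; `level0_near_field` is an assumed name:

```python
import torch
import torch.nn.functional as F

def level0_near_field(q, k, v):
    # q, k, v: (batch, num_blocks, block_size, dim) at level 0
    # pad one zero block on each side of the block axis
    k_pad = F.pad(k, (0, 0, 0, 0, 1, 1))
    v_pad = F.pad(v, (0, 0, 0, 0, 1, 1))
    n = k.shape[1]
    # for every block, gather (left neighbor, self, right neighbor) keys/values
    k_nbr = torch.cat([k_pad[:, i:i + n] for i in range(3)], dim=2)
    v_nbr = torch.cat([v_pad[:, i:i + n] for i in range(3)], dim=2)
    sim = torch.einsum('b n i d, b n j d -> b n i j', q, k_nbr)
    # note: the zero padding blocks should be masked out before the softmax
    # in a real implementation; omitted here for brevity
    attn = sim.softmax(dim=-1)
    return torch.einsum('b n i j, b n j d -> b n i d', attn, v_nbr)
```

With a term like this, the diagonal and the immediately adjacent off-diagonal blocks at level 0 are both covered, which is what Equations 69/70 call for.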
@jinmang2 Hi MyungHoon! I think you are right, and thank you for catching this bug. I've released the changes in 0.1.6 (https://github.com/lucidrains/h-transformer-1d/releases/tag/0.1.6). Would you like to review it and see if it resolves the issue you described?
Thanks for solving it so quickly 😄
Thank you for the opportunity to review; I will check the 0.1.6 code and leave a comment!
closing, since i think it's fine now