Dao-AILab/flash-attention

Typo in paper?

jhss opened this issue · 0 comments

jhss commented

In Appendix B.1, the authors state that $o_i=P_{i:} \mathbf{V}=\sum_j P_{i j} v_j$, where $o_i$ is the $i$-th column of $\mathbf{O}$ and $v_j$ is the $j$-th column of $\mathbf{V}$.

However, if $o_i$ really is the $i$-th column of $\mathbf{O}$, I think it should be $o_i = P V_{:i}$, because the $j$-th entry of that column is $$o_{ji} = \sum_{k=1}^N P_{jk}V_{ki}$$
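To make the two readings concrete, here is a quick NumPy check. It is only a sketch: the sizes `N = 4` and `d = 3`, the random matrices, and the index `i = 1` are arbitrary assumptions for illustration, not anything taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 4, 3                      # arbitrary sizes, assumed for illustration only
P = rng.random((N, N))           # stands in for the attention matrix P
V = rng.random((N, d))           # stands in for the value matrix V
O = P @ V                        # shape (N, d)

i = 1                            # arbitrary index, valid for both rows and columns

# Column reading: the i-th column of O equals P times the i-th column of V.
col_ok = np.allclose(O[:, i], P @ V[:, i])

# Row reading: the i-th row of O equals P[i, :] @ V = sum_j P[i, j] * V[j, :].
row_sum = sum(P[i, j] * V[j, :] for j in range(N))
row_ok = np.allclose(O[i, :], P[i, :] @ V) and np.allclose(O[i, :], row_sum)

print(col_ok, row_ok)
```

Both checks should pass. The formula as printed in the paper matches the row reading, so the discrepancy may lie in the word "column" rather than in the formula itself.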

I would appreciate it if you could confirm whether my suspicion is correct.