Which version of Flash Attention has been used in this project?
Closed this issue · 2 comments
14H034160212 commented
Hi,
I find this project very interesting. May I ask which version of Flash Attention has been used in this project?
The official flash-attention repository provides both FlashAttention and FlashAttention-2.
https://github.com/Dao-AILab/flash-attention
Kind regards,
Qiming
NormXU commented
@14H034160212 Thank you for your interest. I used PyTorch's scaled dot-product attention (`torch.nn.functional.scaled_dot_product_attention`) to speed up inference instead of Dao's implementations.
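For reference, here is a minimal sketch of how PyTorch's SDPA can be called and pinned to its FlashAttention-style backend (requires PyTorch >= 2.0 and a CUDA GPU). The shapes, dtype, and causal setting are illustrative assumptions, not the exact integration in this repo:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, heads, seq_len, head_dim).
# fp16/bf16 on CUDA is required for the flash backend to be eligible.
batch, heads, seq_len, head_dim = 2, 8, 1024, 64
q = torch.randn(batch, heads, seq_len, head_dim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# By default, PyTorch dispatches to a fused kernel (flash, memory-efficient,
# or plain math) based on dtype, device, and shapes. The context manager below
# restricts dispatch to the flash backend only.
with torch.backends.cuda.sdp_kernel(enable_flash=True,
                                    enable_math=False,
                                    enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # torch.Size([2, 8, 1024, 64])
```

Without the context manager, `F.scaled_dot_product_attention` still works and simply picks whichever backend is available, so it also runs on CPU or with fp32 inputs.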
14H034160212 commented