Integration with FlashAttention
ThisisBillhe opened this issue · 1 comment
ThisisBillhe commented
Hi Hao Kang,
I noticed that FlashAttention is disabled in GEAR. Can we use GEAR together with FlashAttention? Is there any challenge here?
HaoKang-Timmy commented
I do not think there are any challenges here. GEAR is a plug-and-play algorithm. Just make sure to decompress the KV caches before running fwd_flashattn.
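To illustrate the decompress-before-attention flow, here is a minimal sketch. The `quantize`, `dequantize`, and `attention` functions below are hypothetical stand-ins (plain uniform int8 quantization and vanilla softmax attention), not GEAR's or FlashAttention's actual implementations; only the idea of calling the forward attention pass (e.g. `fwd_flashattn`) on decompressed KV caches comes from the comment above.

```python
import numpy as np

def quantize(x):
    # Illustrative GEAR-style compression step: uniform int8 quantization.
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0
    zero_point = -lo / scale
    q = np.round(x / scale + zero_point).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover approximate float values from the quantized cache.
    return (q.astype(np.float32) - zero_point) * scale

def attention(qry, key, val):
    # Plain softmax attention as a stand-in for the FlashAttention kernel.
    scores = qry @ key.T / np.sqrt(key.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ val

rng = np.random.default_rng(0)
k = rng.standard_normal((4, 8)).astype(np.float32)  # cached keys
v = rng.standard_normal((4, 8)).astype(np.float32)  # cached values

# KV caches are stored compressed...
k_q, k_s, k_zp = quantize(k)
v_q, v_s, v_zp = quantize(v)

# ...so decompress them first, then run the attention forward pass.
k_dec = dequantize(k_q, k_s, k_zp)
v_dec = dequantize(v_q, v_s, v_zp)
qry = rng.standard_normal((1, 8)).astype(np.float32)
out = attention(qry, k_dec, v_dec)
```

Since the attention kernel only ever sees dense float tensors, the compression scheme and the kernel stay independent, which is what makes the combination plug-and-play.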