opengear-project/GEAR

Integration with FlashAttention

ThisisBillhe opened this issue · 1 comment

Hi Hao Kang,

I notice that FlashAttention is disabled in GEAR. Can we use GEAR together with FlashAttention? Is there any challenge here?

I do not think there are any challenges here. GEAR is a plug-and-play algorithm. Just be sure to decompress the KV caches before running fwd_flashattn.
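A minimal sketch of that flow, under stated assumptions: the uniform 8-bit quantize/dequantize below is only illustrative (GEAR's actual scheme is more involved), and plain softmax attention in numpy stands in for fwd_flashattn. The point is just the ordering: the cache is stored compressed, and is decompressed right before the attention kernel consumes it.

```python
import numpy as np

def quantize(x, bits=8):
    # Illustrative uniform per-tensor quantization; NOT GEAR's actual scheme.
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (2 ** bits - 1)
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    # Recover an approximate float tensor from the compressed representation.
    return q.astype(np.float32) * scale + lo

def attention(q, k, v):
    # Plain softmax attention, standing in for a FlashAttention forward call.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
d = 64
K = rng.standard_normal((16, d)).astype(np.float32)  # cached keys
V = rng.standard_normal((16, d)).astype(np.float32)  # cached values
query = rng.standard_normal((1, d)).astype(np.float32)

# Keep the KV cache in compressed form...
k_compressed = quantize(K)
v_compressed = quantize(V)

# ...and decompress it *before* invoking the attention kernel.
K_hat = dequantize(*k_compressed)
V_hat = dequantize(*v_compressed)
out = attention(query, K_hat, V_hat)
```

Since the attention kernel only ever sees dense float tensors, this composes with FlashAttention (or any other attention implementation) without changes to the kernel itself.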