flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
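
For context on what "linear attention" refers to here, below is a minimal pure-PyTorch sketch of the causal linear-attention recurrence that libraries like this accelerate with fused Triton kernels. It is an illustration of the general technique only, not this repository's API; the function name and shapes are assumptions for the example.

```python
# Minimal sketch of causal linear attention in its recurrent form
# (illustrative only; not flash-linear-attention's actual API).
import torch

def linear_attention(q, k, v):
    """Causal linear attention via a running state.

    q, k, v: tensors of shape (batch, seq_len, dim).
    The state S accumulates outer products k_t v_t^T, so each step
    costs O(dim^2) rather than attending over the whole prefix.
    """
    b, t, d = q.shape
    s = torch.zeros(b, d, d, dtype=q.dtype, device=q.device)  # state: sum of k v^T
    outs = []
    for i in range(t):
        qi, ki, vi = q[:, i], k[:, i], v[:, i]
        s = s + ki.unsqueeze(-1) * vi.unsqueeze(-2)   # S_t = S_{t-1} + k_t v_t^T
        outs.append(torch.einsum('bd,bde->be', qi, s))  # o_t = q_t S_t
    return torch.stack(outs, dim=1)
```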

Primary language: Python · License: MIT
