Get down and dirty with FlashAttention 2.0 in PyTorch: plug-and-play, with no complex CUDA kernels.
Primary language: Python. License: MIT.
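To give a feel for what attention "without CUDA kernels" can look like, here is a minimal sketch of the tiled, online-softmax idea behind FlashAttention, written in plain PyTorch. It is an illustration under stated assumptions, not this repository's implementation: the function name `tiled_attention` and the `block_size` parameter are hypothetical, inputs are assumed to have shape `(batch, heads, seq_len, head_dim)` with matching `head_dim` for queries and values, and masking and dropout are left out.

```python
import math
import torch


def tiled_attention(q, k, v, block_size=64):
    """Exact softmax attention computed one key/value block at a time.

    Hypothetical helper for illustration only (not the repository's API).
    q, k, v: tensors of shape (batch, heads, seq_len, head_dim).
    """
    scale = 1.0 / math.sqrt(q.shape[-1])
    seq_len_k = k.shape[-2]

    # Running per-query-row statistics: max logit, softmax denominator, weighted sum.
    stat_shape = (*q.shape[:-1], 1)
    row_max = torch.full(stat_shape, float("-inf"), dtype=q.dtype, device=q.device)
    row_sum = torch.zeros(stat_shape, dtype=q.dtype, device=q.device)
    out = torch.zeros_like(q)

    for start in range(0, seq_len_k, block_size):
        k_blk = k[..., start:start + block_size, :]
        v_blk = v[..., start:start + block_size, :]

        # Scores for this block only: (batch, heads, seq_len_q, block)
        scores = torch.matmul(q, k_blk.transpose(-2, -1)) * scale
        new_max = torch.maximum(row_max, scores.amax(dim=-1, keepdim=True))

        # Rescale what was accumulated so far to the new running max.
        correction = torch.exp(row_max - new_max)
        probs = torch.exp(scores - new_max)

        row_sum = row_sum * correction + probs.sum(dim=-1, keepdim=True)
        out = out * correction + torch.matmul(probs, v_blk)
        row_max = new_max

    # Normalize once at the end by the accumulated softmax denominator.
    return out / row_sum
```

The full attention matrix is never materialized: each iteration only holds a `seq_len_q x block_size` slice of scores, and the running maximum keeps the exponentials numerically stable. A Python loop like this shows the arithmetic but will not match the speed of a fused kernel.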