libflash_attn

Standalone Flash Attention v2 kernel without libtorch dependency

The Flash Attention v2 kernel has been extracted from the original repository into this one to make it easier to integrate into third-party projects. In particular, the dependency on libtorch was removed.

As a consequence, dropout is not supported (the original code relies on libtorch for its random number generation). Also, only the forward pass is supported for now.
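
The actual entry point and its signature are defined in this repository's headers and may differ; the snippet below is only a hypothetical sketch of what calling a forward-only, dropout-free attention kernel typically looks like (device pointers to Q, K, V, and the output, the usual shape parameters, and a softmax scale).

// Illustrative only: flash_attention_forward() below is a *hypothetical*
// signature standing in for whatever this library actually exposes.
#include <cuda_runtime.h>
#include <cmath>

// Hypothetical entry point: fp16 Q/K/V/O device buffers laid out as
// [batch, seqlen, heads, head_dim], plus a softmax scaling factor.
void flash_attention_forward(const void* q, const void* k, const void* v, void* o,
                             int batch, int seqlen, int num_heads, int head_dim,
                             float softmax_scale, cudaStream_t stream);

int main() {
  const int batch = 2, seqlen = 1024, num_heads = 16, head_dim = 64;
  const size_t elems = (size_t)batch * seqlen * num_heads * head_dim;
  const size_t bytes = elems * sizeof(unsigned short);  // fp16 storage

  void *q, *k, *v, *o;
  cudaMalloc(&q, bytes);
  cudaMalloc(&k, bytes);
  cudaMalloc(&v, bytes);
  cudaMalloc(&o, bytes);
  // ... fill q/k/v with fp16 data ...

  const float softmax_scale = 1.0f / sqrtf((float)head_dim);
  flash_attention_forward(q, k, v, o, batch, seqlen, num_heads, head_dim,
                          softmax_scale, /*stream=*/0);
  cudaDeviceSynchronize();

  cudaFree(q); cudaFree(k); cudaFree(v); cudaFree(o);
  return 0;
}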

Build with:

mkdir build && cd build
cmake ..
make
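
The build should then produce a library that a third-party project can link against directly. The include path and library name used below are assumptions; check CMakeLists.txt and the build output for the actual names.

# Hypothetical linking example; adjust paths and the library name to the actual build output.
nvcc -std=c++17 -I/path/to/libflash_attn/include my_app.cu \
     -L/path/to/libflash_attn/build -lflash_attn -o my_app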

It seems there are compilation issues if g++-9 is used as the host compiler. We confirmed that g++-11 works without issues.
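
If CMake picks up g++-9 (or another problematic host compiler) by default, it can be pointed at g++-11 explicitly when configuring; CMAKE_CXX_COMPILER and CMAKE_CUDA_HOST_COMPILER are standard CMake variables, though the exact setup may depend on your environment:

cmake -DCMAKE_CXX_COMPILER=g++-11 -DCMAKE_CUDA_HOST_COMPILER=g++-11 ..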