Issues
FlashAttention versions
#878 opened - 1
abi true or false
#877 opened - 1
Error in install flash_attn
#873 opened - 1
Will FlashAttention embed Self-Extend?
#868 opened - 5
import flash attention error
#867 opened - 4
Bug in loading of pretrained BERT weights
#864 opened - 0
Ask: Support FP8 KVCache in inference
#863 opened - 0
Broken source distribution
#861 opened - 6
Does rotary_kernel support packed qkv?
#855 opened - 4
ImportError: undefined symbol
#854 opened - 7
Build failure
#843 opened - 1
Error with triton
#838 opened - 26
ESM NAN Values
#834 opened - 7
Q: Support for Nvidia L4
#830 opened - 0
c
#829 opened - 5
Support for Dynamic SplitFuse
#826 opened - 4
[Bug] Compatibility issue with torch 2.2.0
#821 opened - 1
Pip install --no-build-isolation error on Win10 [C++ type casting error during CUDA extension build]
#820 opened - 1
I would like to understand why rotary_dim has to be divisible by 16 in flash_attn_with_kvcache.
#817 opened