Issues
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback): /home/rkuo/.local/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
#1019 opened by rkuo2000 - 1
Can we use this with projects like XInference?
#1018 opened by HakaishinShwet - 2
Logit soft-capping
#1016 opened by kabachuha - 1
build failure
#1017 opened by alxmke - 2
Question on FA-2 worker scheme
#1015 opened by DianCh - 0
flash-attn 1.0.4 cannot be installed
#1014 opened by tangxj98 - 0
/root/flash-attention-main/csrc/flash_attn/src/flash_fwd_kernel.h:7:10: fatal error: cute/tensor.hpp: No such file or directory
#1013 opened by centyuan - 2
FlashAttention-2 backprop question
#1012 opened by DianCh - 1
Variable memory allocation with varlen kernels
#1011 opened by CodeCreator - 2
Does the new flash-attention support ROCm?
#965 opened by JiahuaZhao - 1
Availability of wheel
#1009 opened by nikonikolov - 3
Does flash attention support FP8?
#985 opened by Godlovecui - 0
Global attention to <CLS> tokens with window_size arg
#1001 opened by PetrGarm - 0
Unable to build wheel of flash_attn
#1007 opened by zeiyu314 - 1
FlashAttention PyTorch Integration
#1005 opened by DianCh - 0
flash attention is broken for CUDA 12.x
#1004 opened by Bhagyashreet20 - 1
Is it possible to access the intermediate calculation of q * k multiplication with Flash Attention?
#1003 opened by aciddelgado - 4
Clarification on how to use flash decoding: is it available when calling `flash_attn_func`?
#1002 opened by btruhand - 4
Inappropriate Number of Splits Predicted by determine_num_splits function in flash_attn_with_kvcache (non-paged KV cache)
#1000 opened by izhuhaoran - 15
ImportError: /home/linjl/anaconda3/envs/sd/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
#975 opened by zzc0208 - 1
Does the NVIDIA L40 support FlashAttention-2?
#999 opened by qinsang - 0
Who has successfully used flash_attn on a Jetson?
#986 opened by cthulhu-tww - 2
[BUG] Returning a pointer targeting a local variable
#997 opened by FC-Li - 1
[training] `python run.py` raises `ImportError: cannot import name 'GPTBigCodeConfig' from 'transformers'`
#996 opened by yumemio - 7
CUDA 12.1 build takes an extremely long time; still compiling after about 2 hours
#968 opened by MonolithFoundation - 6
Error Installing FlashAttention on Windows 11 with CUDA 11.8 - "CUDA_HOME environment variable is not set"
#982 opened by Mr-Natural - 4
Error when running flash_attn_func
#994 opened by obhalerao97 - 4
I have flash attention installed, but I get ImportError: Flash Attention 2.0 is not available.
#990 opened by luisegehaijing - 4
[QST] Question about the Dropout
#993 opened by flytigerw - 0
ImportError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory
#992 opened by jxxtin - 0
Error in Algorithm 1 of the FlashAttention-2 paper
#991 opened by mbchang - 4
Deterministic Discussion
#988 opened by YizhouZ - 2
Which dropout type is used for flash attention?
#987 opened by Avelina9X - 7
ImportError: flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
#966 opened by foreverpiano - 2
page not found in setup.py
#979 opened by kishida - 1
Apple Silicon Support
#977 opened by chigkim - 1
Fatal error
#972 opened by ByungKwanLee - 4
DropoutAddRMSNorm using triton backend
#973 opened by fferroni - 2
flash attn: test errors after installation
#971 opened by zhangfan-algo - 4
how to install flash_attn with torch==2.1.0
#969 opened by foreverpiano - 1
Controlling stride of local attention window
#967 opened by EomSooHwan - 1
Need a flash_attn-2.5.2+cu122torch2.2.0cxx11abiFALSE-cp312-cp312-win_amd64.whl wheel
#959 opened by wallfacers - 2
ModuleNotFoundError: No module named 'einops'
#961 opened by zhangfan-algo - 0
I successfully compiled flash_attn on Windows with CUDA 12.1.1 for Python 3.11
#962 opened by cyysky