I have Flash Attention installed, but I get `ImportError: Flash Attention 2.0 is not available`
luisegehaijing opened this issue · 1 comment
luisegehaijing commented
yumemio commented
One of the good things about OSS is the ease of debugging - you can see what's going wrong by reading the library's source code!

In this case, the `transformers` library's `is_flash_attn_2_available()` is returning `False`:
```python
def is_flash_attn_2_available():
    if not is_torch_available():
        return False

    if not _is_package_available("flash_attn"):
        return False

    # Let's add an extra check to see if cuda is available
    import torch

    if not torch.cuda.is_available():
        return False

    if torch.version.cuda:
        return version.parse(importlib.metadata.version("flash_attn")) >= version.parse("2.1.0")
    elif torch.version.hip:
        # TODO: Bump the requirement to 2.1.0 once released in https://github.com/ROCmSoftwarePlatform/flash-attention
        return version.parse(importlib.metadata.version("flash_attn")) >= version.parse("2.0.4")
    else:
        return False
```
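You can also call the helper directly to confirm what `transformers` sees. A quick check, assuming a recent `transformers` release that exports this helper from `transformers.utils`:

```python
# Run this in the same Python environment you use for inference.
# Assumes a recent transformers release that exposes the helper in transformers.utils.
from transformers.utils import is_flash_attn_2_available

print(is_flash_attn_2_available())  # False means one of the checks above failed
```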
So make sure that:

- PyTorch is available (`import torch` succeeds)
- Flash Attention is available (`import flash_attn` succeeds)
- PyTorch can see CUDA (`torch.cuda.is_available() == True`); if it can't, check out this StackOverflow question
- The version of `flash_attn` (`flash_attn.__version__`) is >= 2.1.0, assuming you're using an NVIDIA GPU
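If you'd rather pinpoint the failing check in one go, here's a minimal diagnostic sketch that mirrors the four conditions above (it assumes the standard `torch` and `flash_attn` package names):

```python
# Minimal diagnostic sketch mirroring the checks in is_flash_attn_2_available().
import importlib.metadata

import torch  # check 1: PyTorch is importable

print("CUDA available:", torch.cuda.is_available())  # check 3: PyTorch can see CUDA
print("torch.version.cuda:", torch.version.cuda)     # None on ROCm/CPU-only builds

try:
    import flash_attn  # check 2: Flash Attention is importable
    print("flash_attn version:", importlib.metadata.version("flash_attn"))  # check 4: >= 2.1.0 for CUDA
except ImportError as err:
    print("flash_attn import failed:", err)
```

If `torch.version.cuda` prints `None` even though you have an NVIDIA GPU, you likely installed a CPU-only PyTorch build, which is a common cause of this error.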