TheMistoAI/MistoControlNet-Flux-dev

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:


There's an error message at the KSamplerTheMisto node, and the workflow can't complete. My GPU is a 2080 Ti.

"NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query : shape=(24, 5376, 1, 128) (torch.bfloat16)
key : shape=(24, 5376, 1, 128) (torch.bfloat16)
value : shape=(24, 5376, 1, 128) (torch.bfloat16)
attn_bias : <class 'NoneType'>
p : 0.0
decoderF is not supported because:
attn_bias type is <class 'NoneType'>
bf16 is only supported on A100+ GPUs
flshattF@v2.5.7 is not supported because:
requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
bf16 is only supported on A100+ GPUs
cutlassF is not supported because:
bf16 is only supported on A100+ GPUs
smallkF is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
dtype=torch.bfloat16 (supported: {torch.float32})
bf16 is only supported on A100+ GPUs
unsupported embed per head: 128"

Upgrade your PyTorch, and try a parallel loader in ComfyUI to run the Flux model; Flux is a VRAM monster.

Thanks for your suggestion! Which parallel loader are you referring to? Thanks!


[Screenshot 2024-09-05 121517: workflow]

This is the workflow; my GPU is a 2080 Ti. Total VRAM 22528 MB, total RAM 32618 MB
pytorch version: 2.3.1+cu121
xformers version: 0.0.27
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 Ti : native
Using xformers cross attention
The error message is:
No operator found for memory_efficient_attention_forward with inputs:
query : shape=(24, 3856, 1, 128) (torch.bfloat16)
key : shape=(24, 3856, 1, 128) (torch.bfloat16)
value : shape=(24, 3856, 1, 128) (torch.bfloat16)
attn_bias : <class 'NoneType'>
p : 0.0
decoderF is not supported because:
attn_bias type is <class 'NoneType'>
bf16 is only supported on A100+ GPUs
flshattF@v2.5.7 is not supported because:
requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
bf16 is only supported on A100+ GPUs
cutlassF is not supported because:
bf16 is only supported on A100+ GPUs
smallkF is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
dtype=torch.bfloat16 (supported: {torch.float32})
bf16 is only supported on A100+ GPUs
unsupported embed per head: 128
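
Note that only the bf16 constraint blocks the cutlassF backend on this card; its fp16 kernels do exist on Turing. So one workaround is to cast the attention inputs down to fp16 before the xformers call. Below is a minimal sketch of that idea; `attention_with_dtype_fallback` is a hypothetical helper, not part of the MistoControlNet code, and the more practical route is usually to run the whole model in fp16 instead of bf16 (for example via ComfyUI's `--force-fp16` launch flag).

```python
import torch
import xformers.ops as xops

def attention_with_dtype_fallback(q, k, v, attn_bias=None, p=0.0):
    # Hypothetical workaround sketch: on pre-Ampere GPUs (capability < 8.0)
    # the bf16 kernels are missing, but the fp16 cutlass kernel is available,
    # so cast down, run attention, and cast the output back to bf16.
    # attn_bias is None in the error above, so no bias cast is needed here.
    major, _ = torch.cuda.get_device_capability(q.device)
    if q.dtype is torch.bfloat16 and major < 8:
        out = xops.memory_efficient_attention(
            q.half(), k.half(), v.half(), attn_bias=attn_bias, p=p
        )
        return out.to(torch.bfloat16)
    return xops.memory_efficient_attention(q, k, v, attn_bias=attn_bias, p=p)
```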