OpenMOSS/CoLLiE

flash_attn error when fine-tuning MOSS 7B with CoLLiE on a 3090

Closed this issue · 2 comments

RuntimeError: FlashAttention backward for head dim > 64 requires A100 or H100 GPUs as the implementation needs a large amount of shared memory.

Hi, the head dim of MOSS 7B is 128, which flash-attn does not support on a 3090. You can set config.use_flash=False.
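A minimal sketch of the suggested workaround, assuming the CoLLiE `CollieConfig` API and the MOSS checkpoint name shown here (both illustrative, not taken from this thread):

```python
# Sketch: disable FlashAttention in CoLLiE so attention falls back to the
# standard PyTorch implementation, which works on pre-Ampere/consumer GPUs.
# The `use_flash` attribute comes from the maintainer's reply above; the
# checkpoint name is an assumption for illustration.
from collie import CollieConfig

config = CollieConfig.from_pretrained("fnlp/moss-moon-003-sft")
config.use_flash = False  # avoid the head-dim-128 FlashAttention backward path
```

This trades the memory and speed benefits of FlashAttention for compatibility, since the fused backward kernel for head dim > 64 needs more shared memory per SM than a 3090 provides.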

ZiboZ commented

Hi, doesn't FlashAttention choose its block size based on the L1 cache (shared memory) size? In theory, shouldn't a V100 also be supported?