thu-ml/SageAttention
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Python · BSD-3-Clause
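As a minimal usage sketch (not taken from this page), assuming the `sageattn` function mentioned in issue #19 accepts q, k, v tensors in the usual (batch, heads, seq_len, head_dim) layout and an `is_causal` flag analogous to `torch.nn.functional.scaled_dot_product_attention`; the exact keyword arguments are assumptions, so check the repository README:

```python
# Sketch only: swapping PyTorch SDPA for the quantized sageattn kernel.
# Assumptions: sageattn takes (q, k, v) shaped (batch, heads, seq_len, head_dim)
# in fp16/bf16 on a CUDA GPU and returns a tensor of the same shape.
import torch
from sageattention import sageattn

batch, heads, seq_len, head_dim = 2, 16, 4096, 128
q = torch.randn(batch, heads, seq_len, head_dim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Baseline: PyTorch's built-in scaled dot-product attention.
ref = torch.nn.functional.scaled_dot_product_attention(q, k, v, is_causal=False)

# Drop-in quantized attention kernel.
out = sageattn(q, k, v, is_causal=False)
print(out.shape, (out - ref).abs().max().item())
```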
Issues
Q matrix quantization
#27 opened - 6
Why divide by ln 2 when quantizing the Q values?
#24 opened - 3
Real acceleration benefits
#22 opened - 4
Can SageAttention be used on AMD GPUs?
#20 opened - 5
NaN values appear when using sageattn
#19 opened - 1
Notation error in Equation (2)
#18 opened - 2
Will other headdim values be supported?
#17 opened - 0
Other SageAttention Kernels
#16 opened - 0
Encountered some compatibility issues
#14 opened - 1
Can you provide an example for LLaMA?
#13 opened - 1
Question about INT8 vs. FP8
#12 opened - 2
SageAttention on ComfyUI
#11 opened - 2
Accuracy Comparison at the Kernel Level
#10 opened - 3
Is Stable Diffusion supported?
#9 opened - 0
How can I make it work on Windows?
#4 opened - 6
Question about performance on A100
#3 opened - 2
BF16 q,k,v
#2 opened - 1
Example usage doesn't work
#1 opened