softmax1/Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
Python · GPL-3.0 license
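For reference, a minimal NumPy sketch of the softmax_n variant the repo's kernels implement, assuming the common definition in which a constant n is added to the softmax denominator (n = 0 recovers the ordinary softmax; the function name and stability trick here are illustrative, not taken from this codebase):

```python
import numpy as np

def softmax_n(x, n=1):
    """softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j)).

    With n > 0 the outputs may sum to less than 1, letting an
    attention head emit "no-op" (near-zero) weights for every key.
    """
    m = np.max(x, axis=-1, keepdims=True)
    # Shift by the row max for numerical stability; the constant n
    # must be rescaled by exp(-m) to keep the ratio unchanged.
    e = np.exp(x - m)
    return e / (n * np.exp(-m) + e.sum(axis=-1, keepdims=True))
```

For example, `softmax_n(np.array([0.0, 0.0]), n=1)` yields `[1/3, 1/3]`, whereas the ordinary softmax would yield `[1/2, 1/2]`.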
Issues
- Support additional data types (#16, opened by christopher-w-murphy, 1 comment)
- Add Dropout (#15, opened by christopher-w-murphy, 2 comments)
- Parameterize the Flash Attention (#14, opened by christopher-w-murphy)