softmax1/Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
Python · GPL-3.0 license
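For reference, a minimal NumPy sketch of the softmax_n variant the repo's kernels implement, assuming the common definition in which a constant n is added to the softmax denominator (n = 0 recovers the ordinary softmax; the function name and stability trick here are illustrative, not taken from this codebase):

```python
import numpy as np

def softmax_n(x, n=1):
    """softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j)).

    With n > 0 the outputs may sum to less than 1, letting an
    attention head emit "no-op" (near-zero) weights for every key.
    """
    m = np.max(x, axis=-1, keepdims=True)
    # Shift by the row max for numerical stability; the constant n
    # must be rescaled by exp(-m) to keep the ratio unchanged.
    e = np.exp(x - m)
    return e / (n * np.exp(-m) + e.sum(axis=-1, keepdims=True))
```

For example, `softmax_n(np.array([0.0, 0.0]), n=1)` yields `[1/3, 1/3]`, whereas the ordinary softmax would yield `[1/2, 1/2]`.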
Issues
- Support additional data types (#16, opened by christopher-w-murphy, 1 comment)
- Add Dropout (#15, opened by christopher-w-murphy, 2 comments)
- Parameterize the Flash Attention (#14, opened by christopher-w-murphy)