tspeterkim/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
Language: Cuda · License: Apache-2.0
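The repo implements the Flash Attention forward pass: attention computed in tiles over the key/value sequence using an online softmax, so the full score matrix is never materialized. Below is a minimal pure-Python sketch of that tiling scheme (names like `flash_attention` and the block size are illustrative, not the repo's actual CUDA kernel):

```python
import math

def naive_attention(Q, K, V, scale):
    # Reference: softmax(Q K^T * scale) V, one query row at a time.
    out = []
    for q in Q:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in K]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        out.append([sum(e * v[d] for e, v in zip(exps, V)) / z
                    for d in range(len(V[0]))])
    return out

def flash_attention(Q, K, V, scale, block=2):
    # Tiled forward pass: stream over K/V blocks, keeping a running
    # row max (m), softmax denominator (l), and unnormalized output (o).
    d = len(V[0])
    out = []
    for q in Q:
        m = -math.inf
        l = 0.0
        o = [0.0] * d
        for start in range(0, len(K), block):
            Kb = K[start:start + block]
            Vb = V[start:start + block]
            s = [scale * sum(a * b for a, b in zip(q, k)) for k in Kb]
            new_m = max(m, max(s))
            corr = math.exp(m - new_m)  # rescale previous accumulators
            p = [math.exp(x - new_m) for x in s]
            l = l * corr + sum(p)
            o = [o_i * corr + sum(p_j * v[i] for p_j, v in zip(p, Vb))
                 for i, o_i in enumerate(o)]
            m = new_m
        out.append([o_i / l for o_i in o])
    return out
```

Because the rescaling factor `corr` folds the old running max into the new one, the tiled result matches the naive softmax attention exactly (up to floating-point rounding), which is the key identity Flash Attention relies on.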
Issues
- Does this repo support tensor cores? (#5, opened by Rane2021)
- slow in for loop test (#3, opened by DefTruth)
- Correctness parameters (#1, opened by cogumbreiro)