softmax1/Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with Softmax-N.
Python · GPL-3.0
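For context on the repository's namesake: Softmax-N (often written softmax_n) is commonly described as a softmax variant that adds a constant n to the denominator, so the output weights can sum to less than one and the attention head can "abstain". The sketch below assumes that definition; the function name and its numerically stable shift are illustrative, not taken from this repository's code.

```python
import numpy as np

def softmax_n(x, n=1.0, axis=-1):
    """Assumed softmax_n: exp(x_i) / (n + sum_j exp(x_j)).

    n=0 recovers the standard softmax; n=1 is the "softmax1" /
    "quiet softmax" variant the repo name appears to reference.
    """
    # Subtract the row max for numerical stability. Because the
    # denominator carries an extra constant n, that constant must be
    # rescaled by the same shift: divide top and bottom by exp(m).
    m = np.max(x, axis=axis, keepdims=True)
    e = np.exp(x - m)
    return e / (n * np.exp(-m) + np.sum(e, axis=axis, keepdims=True))
```

With n > 0 the outputs no longer sum to 1, which is the point: near-uniformly negative logits let the weights all shrink toward zero instead of being forced to distribute a full unit of attention.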
Stargazers
- AIexanderDicke (@webcomputing)
- Asthestarsfalll
- dfd
- DTennant (Shanghai)
- Erland366
- evanmiller (Anthropic)
- evelynmitchell (Fort Collins, CO)
- fly51fly (PRIS)
- fortressRain
- gijskoning (Corbotics)
- goodboyyes2009
- Infi-zc
- joey00072
- L1aoXingyu (Beijing, China)
- leslyarun
- liujingcs (Monash University)
- lxuechen (Stanford University)
- MarcellusZhao (École Polytechnique Fédérale de Lausanne)
- MARD1NO (SiliconFlow)
- mcapodici (Sydney, AU)
- okotaku (Orange)
- photomz (Galileo AI)
- pierrot-lc (Inria)
- radarFudan (NUS)
- rightchose (Zhejiang University)
- sainivedh
- samlkrystof
- SandalotsVolcanak
- sungkim11
- SushantDaga
- sustcsonglin (MIT)
- thepushkarp (Samsung R&D Institute India)
- tlin-taolin (@epfml)
- wang-debug
- YangWang92
- yzhangcs (Soochow University)