fused-attention

A fast, low-memory attention layer written in CUDA.

Primary language: CUDA
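
Below is a minimal, hedged sketch of the general fused-attention idea: each thread processes one query row and streams over the keys and values with an online softmax, so the full N x N score matrix is never materialized in global memory. This is an illustrative example only, not the repository's actual kernel; the kernel name, fixed head dimension, and launch configuration are assumptions made for the sketch.

```cuda
// Sketch of a fused attention kernel: one thread per query row, online softmax.
// Assumed names and a fixed head dimension D; not the repository's implementation.
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

constexpr int D = 64;  // head dimension (assumed for this sketch)

__global__ void fused_attention(const float* __restrict__ Q,
                                const float* __restrict__ K,
                                const float* __restrict__ V,
                                float* __restrict__ O,
                                int N, float scale) {
    int q = blockIdx.x * blockDim.x + threadIdx.x;  // query row handled by this thread
    if (q >= N) return;

    float acc[D];                  // running weighted sum of V rows
    for (int d = 0; d < D; ++d) acc[d] = 0.0f;
    float m = -INFINITY;           // running max of scores (numerical stability)
    float l = 0.0f;                // running softmax denominator

    for (int k = 0; k < N; ++k) {
        // scaled dot-product score for this (query, key) pair
        float s = 0.0f;
        for (int d = 0; d < D; ++d) s += Q[q * D + d] * K[k * D + d];
        s *= scale;

        // online softmax update: rescale previous partial results if the max grows
        float m_new = fmaxf(m, s);
        float correction = expf(m - m_new);
        float p = expf(s - m_new);
        l = l * correction + p;
        for (int d = 0; d < D; ++d)
            acc[d] = acc[d] * correction + p * V[k * D + d];
        m = m_new;
    }

    // normalize by the softmax denominator to get the output row
    for (int d = 0; d < D; ++d) O[q * D + d] = acc[d] / l;
}

int main() {
    const int N = 128;
    size_t bytes = size_t(N) * D * sizeof(float);
    float *Q, *K, *V, *O;
    cudaMallocManaged(&Q, bytes);
    cudaMallocManaged(&K, bytes);
    cudaMallocManaged(&V, bytes);
    cudaMallocManaged(&O, bytes);
    for (int i = 0; i < N * D; ++i) {  // simple deterministic test data
        Q[i] = 0.01f * (i % 7);
        K[i] = 0.01f * (i % 5);
        V[i] = 0.01f * (i % 3);
    }
    float scale = 1.0f / sqrtf((float)D);
    fused_attention<<<(N + 127) / 128, 128>>>(Q, K, V, O, N, scale);
    cudaDeviceSynchronize();
    printf("O[0][0] = %f\n", O[0]);
    cudaFree(Q); cudaFree(K); cudaFree(V); cudaFree(O);
    return 0;
}
```

The memory saving comes from the streaming update: only the per-row accumulator, running max, and running sum live on chip, while the attention scores are consumed as they are produced. Production kernels additionally tile keys and values through shared memory and parallelize across heads and batches, which this sketch omits for brevity.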
