Fast and low-memory attention layer written in CUDA
Primary Language: CUDA