moskomule/memory_efficient_attention.pytorch

A human-readable PyTorch implementation of "Self-attention Does Not Need O(n^2) Memory" (Rabe & Staats, 2021).

Python, Apache-2.0

Stargazers

Birch-san (some financial technology company)
guanfuchen (Zhejiang University)
himkt (Tokyo, Japan)
JeffCarpenter (Canada)
numb3r3 (@jina-ai)
oleg-kachan
sun254667307
Swall0w (FA)
