henrywoo/memory-efficient-attention-pytorch
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"
Python · MIT License
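As a rough illustration of the idea named in the description, the sketch below computes softmax attention for a single query by streaming over keys and values in chunks, keeping only a running maximum, a running sum of exponentials, and a running weighted sum of values. It is not taken from this repository: the function names `naive_attention` and `chunked_attention`, the `chunk_size` parameter, and the use of plain Python lists instead of PyTorch tensors are all assumptions made for a dependency-free example.

```python
import math


def _scores(q, keys):
    # Scaled dot-product scores between one query and a list of keys.
    scale = 1.0 / math.sqrt(len(q))
    return [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]


def naive_attention(q, keys, values):
    # Reference version: materializes all scores at once.
    scores = _scores(q, keys)
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def chunked_attention(q, keys, values, chunk_size=2):
    # Hypothetical sketch: only one chunk of scores is held at a time.
    dim = len(values[0])
    running_max = -math.inf   # largest score seen so far
    running_sum = 0.0         # sum of exp(score - running_max)
    acc = [0.0] * dim         # sum of exp(score - running_max) * value

    for start in range(0, len(keys), chunk_size):
        k_chunk = keys[start:start + chunk_size]
        v_chunk = values[start:start + chunk_size]
        scores = _scores(q, k_chunk)

        # Rescale previous accumulators to the new maximum for stability.
        new_max = max(running_max, max(scores))
        rescale = math.exp(running_max - new_max)
        running_sum *= rescale
        acc = [a * rescale for a in acc]

        for s, v in zip(scores, v_chunk):
            w = math.exp(s - new_max)
            running_sum += w
            for d in range(dim):
                acc[d] += w * v[d]
        running_max = new_max

    return [a / running_sum for a in acc]


q = [0.5, -1.0, 2.0]
keys = [[1.0, 0.0, 1.0], [0.0, 2.0, -1.0], [3.0, 1.0, 0.5],
        [-1.0, -1.0, 4.0], [0.2, 0.3, 0.4]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]

full = naive_attention(q, keys, values)
streamed = chunked_attention(q, keys, values, chunk_size=2)
```

Both functions should agree up to floating-point error. The chunked version needs memory proportional to `chunk_size` for the scores rather than to the full sequence length; extending it to many queries would mean looping or chunking over queries as well.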