/SoftmaxOutputApproximation

[NeurIPS 2023] Softmax Output Approximation for Activation Memory-Efficient Training of Attention-based Networks

Primary Language: Python
