/Attention-Sink

[ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View

Primary language: Python | License: MIT
