/attention-rank-collapse

[ICML 2021 Oral] We show that pure attention suffers from rank collapse, and how different mechanisms combat it.

Primary Language: Jupyter Notebook
License: Apache License 2.0 (Apache-2.0)
