Mega-pytorch

Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena

Primary language: Python
License: MIT
