/transformer-aan

Source code for "Accelerating Neural Transformer via an Average Attention Network"

Primary Language: Python
License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause)
