/linear-attention-transformer

A Transformer based on a variant of attention whose complexity is linear with respect to sequence length

Primary language: Python. License: MIT.
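To make the "linear in sequence length" claim concrete, here is a minimal NumPy sketch of one common way to get linear-complexity attention: replace the softmax with a positive feature map and reorder the matrix products. The description above does not say which variant this repository implements, so treat this as a generic illustration, not the repository's code or API. The names `feature_map`, `linear_attention`, and `quadratic_reference` are hypothetical, and the `elu(x) + 1` feature map is an assumption chosen for illustration.

```python
import numpy as np


def feature_map(x):
    # Assumed positive feature map: elu(x) + 1.
    # For x > 0 this is x + 1; otherwise it is exp(x).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v, eps=1e-6):
    """Non-causal kernelized attention for one head.

    q, k: (n, d) arrays; v: (n, e) array.
    Cost is O(n * d * e): no (n, n) matrix is ever formed.
    """
    q, k = feature_map(q), feature_map(k)
    kv = k.T @ v               # (d, e) summary of all keys and values
    z = k.sum(axis=0)          # (d,) normalizer summary
    numerator = q @ kv         # (n, e)
    denominator = q @ z + eps  # (n,)
    return numerator / denominator[:, None]


def quadratic_reference(q, k, v, eps=1e-6):
    """Same computation done the O(n^2) way, for comparison only."""
    q, k = feature_map(q), feature_map(k)
    scores = q @ k.T           # (n, n) pairwise similarities
    return (scores @ v) / (scores.sum(axis=1, keepdims=True) + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((8, 4))
    k = rng.standard_normal((8, 4))
    v = rng.standard_normal((8, 3))
    out = linear_attention(q, k, v)
    print(out.shape, np.allclose(out, quadratic_reference(q, k, v)))
```

The saving comes from associativity: `(q @ k.T) @ v` builds an `n × n` matrix, while `q @ (k.T @ v)` only builds a `d × e` one, so the cost grows linearly with the sequence length `n` when `d` and `e` are fixed.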

Issues