/Linear-Multihead-Attention

A reproduction of the Linear Multihead Attention introduced in the Linformer paper (Linformer: Self-Attention with Linear Complexity).
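For readers unfamiliar with the technique, here is a minimal NumPy sketch of the idea behind Linformer-style attention. It is not this repository's code. Keys and values of sequence length n are projected down to a fixed length k before attention, so the score matrix has shape n × k rather than n × n. The function name, the argument names, and the use of a single pair of projections (`proj_e`, `proj_f`) shared across heads are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def linear_multihead_attention(x, w_q, w_k, w_v, w_o, proj_e, proj_f, num_heads):
    """Hypothetical sketch of Linformer-style self-attention for one sequence.

    x:              (n, d_model) input sequence
    w_q/w_k/w_v/w_o: (d_model, d_model) weight matrices
    proj_e, proj_f: (k, n) projections that shorten keys and values
                    from length n to length k (shared across heads here)
    """
    n, d_model = x.shape
    d_head = d_model // num_heads

    q = x @ w_q                 # (n, d_model)
    k = proj_e @ (x @ w_k)      # (k, d_model): keys compressed along the sequence axis
    v = proj_f @ (x @ w_v)      # (k, d_model): values compressed the same way

    # Split the feature axis into heads: (heads, length, d_head).
    q = q.reshape(n, num_heads, d_head).transpose(1, 0, 2)
    k = k.reshape(-1, num_heads, d_head).transpose(1, 0, 2)
    v = v.reshape(-1, num_heads, d_head).transpose(1, 0, 2)

    # Scores have shape (heads, n, k) instead of (heads, n, n).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)

    out = weights @ v                                   # (heads, n, d_head)
    out = out.transpose(1, 0, 2).reshape(n, d_model)    # merge heads back
    return out @ w_o, weights
```

With k held fixed, the cost of the score computation grows linearly in n, which is the property the paper's title refers to.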

Primary Language: Python
