/MakeMultiHeadNaive

Use a naive MultiheadAttention implementation to replace nn.MultiheadAttention in PyTorch.
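As a rough illustration of the idea, the sketch below shows one way a naive self-attention module could stand in for `nn.MultiheadAttention`. It is not taken from this repository: the class name `NaiveMultiheadAttention` and the helper `from_torch` are hypothetical, and the sketch assumes self-attention only, `batch_first=True`, default `kdim`/`vdim`, bias enabled, no masks, and no dropout.

```python
import math

import torch
import torch.nn as nn


class NaiveMultiheadAttention(nn.Module):
    """Hypothetical naive self-attention (illustrative, not the repository's code)."""

    def __init__(self, embed_dim, num_heads):
        super().__init__()
        assert embed_dim % num_heads == 0, "embed_dim must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        # Separate projections instead of the packed in_proj_weight used by PyTorch.
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    @classmethod
    def from_torch(cls, mha):
        """Copy weights from an nn.MultiheadAttention (assumes packed in_proj and bias)."""
        naive = cls(mha.embed_dim, mha.num_heads)
        # The packed projection stacks the query, key, and value weights along dim 0.
        w_q, w_k, w_v = mha.in_proj_weight.chunk(3, dim=0)
        b_q, b_k, b_v = mha.in_proj_bias.chunk(3, dim=0)
        with torch.no_grad():
            naive.q_proj.weight.copy_(w_q)
            naive.k_proj.weight.copy_(w_k)
            naive.v_proj.weight.copy_(w_v)
            naive.q_proj.bias.copy_(b_q)
            naive.k_proj.bias.copy_(b_k)
            naive.v_proj.bias.copy_(b_v)
            naive.out_proj.weight.copy_(mha.out_proj.weight)
            naive.out_proj.bias.copy_(mha.out_proj.bias)
        return naive

    def forward(self, x):
        # x has shape (batch, seq_len, embed_dim)
        batch, seq_len, embed_dim = x.shape

        def split_heads(t):
            # (batch, seq_len, embed_dim) -> (batch, num_heads, seq_len, head_dim)
            return t.view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)

        q = split_heads(self.q_proj(x))
        k = split_heads(self.k_proj(x))
        v = split_heads(self.v_proj(x))

        # Scaled dot-product attention, written out explicitly.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
        weights = scores.softmax(dim=-1)
        context = weights @ v

        # Merge heads back: (batch, num_heads, seq_len, head_dim) -> (batch, seq_len, embed_dim)
        context = context.transpose(1, 2).reshape(batch, seq_len, embed_dim)
        # Return head-averaged weights, matching nn.MultiheadAttention's default.
        return self.out_proj(context), weights.mean(dim=1)
```

Under these assumptions, `NaiveMultiheadAttention.from_torch(mha)(x)` should agree with `mha(x, x, x)` up to floating-point tolerance, which makes it straightforward to check the replacement numerically before swapping it in.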

Primary Language: Python
