Transformer_model

Investigation into Transformer self-attention building blocks, and the effects of pretraining.
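Since the repository's code is not shown here, a minimal sketch of the core self-attention building block it investigates might look like the following (names, shapes, and the NumPy-only single-head formulation are illustrative assumptions, not the repo's actual implementation):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    Illustrative sketch only; a real Transformer adds learned Q/K/V
    projections, multiple heads, and masking.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # (seq, seq) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)   # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ v                             # weighted sum of values

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                    # 4 tokens, model dim 8
out = scaled_dot_product_attention(x, x, x)        # self-attention: Q = K = V = x
print(out.shape)                                   # (4, 8)
```

In self-attention the queries, keys, and values all come from the same sequence, which is why `x` is passed three times above.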

Primary language: Python
