/transformer

The Transformer model implemented from scratch using PyTorch. The model uses weight sharing between the embedding layers and the pre-softmax linear layer. Training on the Multi30k machine translation task is shown.

Primary LanguagePython

Issues