/Rethinking-attention

My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently included IWSLT pretrained models.

Primary LanguagePythonMIT LicenseMIT

Stargazers