This is a PyTorch implementation of the Transformer model in the paper Attention is All You Need
Primary LanguagePythonMIT LicenseMIT