torch-transformer

Transformer-based architectures.

Currently in the repo:
- The Transformer from the paper "Attention Is All You Need"