Implementation of a transformer language model in PyTorch