simple-parallel-transformer

As it says on the tin, this repo has a simple implementation of a transformer model, with some borrowed efficiency improvements. The purpose is mainly pedagogical.