Transformers from scratch

The architecture of modern transformer-like models often confuses me, which is quite frustrating. To overcome this, I'll try to implement a transformer model from scratch as an exercise.