https://peterbloem.nl/blog/transformers
https://e2eml.school/transformers.html
https://github.com/lucidrains/attention
https://www.borealisai.com/en/blog/tutorial-14-transformers-i-introduction/
https://towardsdatascience.com/illustrated-self-attention-2d627e33b20a
https://arthurdouillard.com/deepcourse/archi/
https://deepmind.com/blog/article/building-architectures-that-can-handle-the-worlds-data
http://jbcordonnier.com/posts/attention-cnn/
https://theaisummer.com/attention/
https://theaisummer.com/vision-language-models/
https://transformer-circuits.pub/2021/framework/index.html
https://arxiv.org/pdf/2207.09238.pdf
http://www.columbia.edu/~jsl2239/transformers.html