Nano-Transformers is a project for Transformer-related education and quick knowledge reference.
This project aims to
- help you easily understand Transformers from detailed, simple code
- help you easily write Keras code, with fewer complaints about TensorFlow
- help you easily get investment from VCs if you work on web3
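As a taste of the "detailed, simple code" promised above, here is a minimal scaled dot-product attention sketch, the core operation of every Transformer. It is written in plain NumPy rather than Keras so it runs anywhere; the function name and shapes are illustrative and not taken from this repository.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """q, k, v: arrays of shape (seq_len, d_model)."""
    d_k = q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k)
    scores = q @ k.T / np.sqrt(d_k)           # (seq_len, seq_len)
    # Softmax over the key axis turns scores into attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value rows
    return weights @ v                        # (seq_len, d_model)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)   # self-attention
print(out.shape)  # (4, 8)
```

Multi-head attention is just this operation repeated on several learned projections of the input, then concatenated.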
InstructGPT, not fully done yet.
- Tried to elicit emergent abilities with a small model and small data. Failed: the model can't do reasoning, but it memorizes well.
- greentfrapp/attention-primer
- It's a TensorFlow 1 implementation. That project inspired nano-transformers.
- sainathadapa/attention-primer-pytorch
- It's a PyTorch implementation, forked from the previous one.
- karpathy/nanoGPT
- It's a PyTorch implementation of GPT-2, a rewrite of karpathy/minGPT.
- Andrej Karpathy was Tesla's AI director from 2017 to 2022, the man behind the magic. He is actively developing nanoGPT; we can learn a lot from his detailed optimizations.
- The three GPT images are captured from the GPT-1, GPT-2, and GPT-3 papers.
- Attention Free Transformer
- Rethinking Attention with Performers