Originally implemented in PyTorch by Andrej Karpathy: "karpathy/minGPT".
I also have another implementation of GPT-2 in TensorFlow; have a look at "akanyaani/gpt-2-tensorflow2.0".
Setup
$ git clone https://github.com/akanyaani/minGPTF
$ cd minGPTF
$ python setup.py install
Usage
To generate text with GPT-2, open the notebook:
$ open generate.ipynb
Here's how you'd instantiate a GPT-2 (124M param version):
$ from mingptf.model import GPT
$ model_config = GPT.get_default_config()
$ model_config.vocab_size = 50257 # openai's model vocabulary
$ model_config.block_size = 1024 # openai's model block_size (i.e. input context length)
$ model = GPT(model_config)
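To see where the "124M" figure comes from, the parameter count can be estimated by hand. The sketch below is plain Python and does not use mingptf. It assumes the published GPT-2 small hyperparameters (12 layers, embedding width 768), a 4x MLP expansion, and an output head that shares weights with the token embedding; whether mingptf's default config uses exactly these values is an assumption. The function name gpt2_param_count is hypothetical.

```python
# Rough parameter count for a GPT-2 style decoder (illustrative sketch, not mingptf code).
# Assumed: n_layer=12 and n_embd=768 for the 124M model, 4x MLP expansion,
# and an output head tied to the token embedding (so it adds no parameters).

def gpt2_param_count(vocab_size, block_size, n_layer, n_embd):
    """Count weights and biases of a GPT-2 style model with a tied output head."""
    token_emb = vocab_size * n_embd               # token embedding table
    pos_emb = block_size * n_embd                 # learned position embeddings
    layer_norm = 2 * n_embd                       # scale and shift
    attn_qkv = n_embd * 3 * n_embd + 3 * n_embd   # fused query/key/value projection
    attn_proj = n_embd * n_embd + n_embd          # attention output projection
    mlp_fc = n_embd * 4 * n_embd + 4 * n_embd     # MLP expansion
    mlp_proj = 4 * n_embd * n_embd + n_embd       # MLP contraction
    per_block = 2 * layer_norm + attn_qkv + attn_proj + mlp_fc + mlp_proj
    # One extra layer norm is applied after the last block.
    return token_emb + pos_emb + n_layer * per_block + layer_norm

total = gpt2_param_count(vocab_size=50257, block_size=1024, n_layer=12, n_embd=768)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
# prints: 124,439,808 parameters (~124M)
```

Roughly a third of the total sits in the token embedding table, which is why vocab_size matters as much as depth for a model this small.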
And here's how you'd train it:
$ from mingptf.model import GPT
$ model_config = GPT.get_default_config()
$ model_config.model_type = 'gpt-micro'
$ model_config.vocab_size = 50257
$ model_config.block_size = 128
$ model = GPT(model_config)
$ train_config = get_default_train_config() # assumes get_default_train_config has already been imported
$ train_config.learning_rate = 5e-4 # the model we're using is so small that we can go a bit faster
$ train_config.max_iters = 2000
$ model.configure_optimizers(train_config)
$ model.fit(train_data, test_data, test_freq=5) # train_data and test_data are datasets you prepare beforehand
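The snippet above does not show how train_data and test_data are built. As a general illustration of language-model training data, the sketch below cuts a token stream into windows of block_size tokens and pairs each window with the same window shifted by one position, so the model learns to predict the next token. It is plain Python; the helper name make_next_token_pairs is hypothetical, and the exact input format that model.fit expects is not specified here, so treat this as an assumption about the general shape of the data rather than the library's API.

```python
# Illustrative sketch: build (input, target) pairs for next-token prediction.
# The function name and the list-of-tuples output are assumptions for illustration;
# they are not taken from mingptf.

def make_next_token_pairs(token_ids, block_size):
    """Split token_ids into non-overlapping windows of length block_size.

    Each target is its input shifted one position to the left, so that
    target[i] is the token that follows input[i] in the original stream.
    """
    pairs = []
    # Stop early enough that every target window is also full length.
    for start in range(0, len(token_ids) - block_size, block_size):
        x = token_ids[start : start + block_size]
        y = token_ids[start + 1 : start + block_size + 1]
        pairs.append((x, y))
    return pairs

tokens = list(range(10))  # stand-in for real token ids
pairs = make_next_token_pairs(tokens, block_size=4)
for x, y in pairs:
    print(x, "->", y)
# prints:
# [0, 1, 2, 3] -> [1, 2, 3, 4]
# [4, 5, 6, 7] -> [5, 6, 7, 8]
```

With block_size = 128 as in the training config, each example would hold 128 input tokens and 128 targets; trailing tokens that do not fill a complete window are dropped.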
TO DO
1. TensorBoard logging.
2. Mixed-precision training.
3. Fine-tuning wrapper.
References:
- "karpathy/minGPT"
- "akanyaani/gpt-2-tensorflow2.0"
- "openai/gpt-2"
- "Hugging Face pytorch-transformers"
- "TensorFlow Transformers"
- "The Illustrated GPT-2"
Contribution
- Your issues and PRs are always welcome.
Author
- Abhay Kumar
- Author email: akanyaani@gmail.com
- Follow me on Twitter
License