
Yet another Toy Pretrain(able) Autoregressive Transformer

Primary Language: Python

Story Generator using Pretrained Autoregressive Transformer Model


This is a Python implementation of a story generator using a transformer model. The model is trained on the TinyStoriesV2 dataset and can complete stories based on a given prompt.


  • Generates stories based on a given prompt
  • Uses a transformer model to generate text
  • Includes data loading and preprocessing utilities
  • Supports training and evaluation of the model
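As a rough illustration of prompt-based generation, the following is a minimal, library-free sketch of an autoregressive decoding loop. It is not the repository's code: `sample_next`, `generate`, and `toy_next_logits` are hypothetical names, and the toy function stands in for the transformer, which in the real project would map token ids (produced by the tokenizer) to next-token logits.

```python
import math
import random


def sample_next(logits, temperature=1.0, rng=random):
    """Pick a token id from unnormalized logits (hypothetical helper)."""
    if temperature <= 0:
        # Treat a non-positive temperature as greedy decoding (argmax).
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(next_logits, prompt_ids, max_new_tokens, eos_id=None,
             temperature=1.0, rng=random):
    """Append one sampled token at a time, feeding the sequence back in."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        token = sample_next(next_logits(ids), temperature, rng)
        ids.append(token)
        if eos_id is not None and token == eos_id:
            break  # stop early at the end-of-sequence token
    return ids


def toy_next_logits(ids, vocab_size=5):
    """Stand-in for the model: strongly prefers (last token + 1) mod vocab_size."""
    target = (ids[-1] + 1) % vocab_size
    return [10.0 if i == target else 0.0 for i in range(vocab_size)]


story_ids = generate(toy_next_logits, [0], max_new_tokens=4, temperature=0)
```

With a real model, the prompt would first be encoded to token ids and the returned ids decoded back to text. A lower temperature makes the output more deterministic, while a higher one makes it more varied.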

Technical Details

  • The model is implemented in PyTorch (torch) and uses PyTorch's scaled dot-product attention for its multi-head attention layers
  • Tokenization uses tiktoken
  • The model is trained with a custom training loop that uses cosine annealing with learning rate warmup
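The learning rate schedule mentioned above can be sketched as a plain function of the training step. This is a minimal sketch under assumptions, not the repository's implementation: the function name `lr_at_step`, the linear shape of the warmup, and all default values (`max_lr`, `min_lr`, `warmup_steps`, `total_steps`) are illustrative choices.

```python
import math


def lr_at_step(step, max_lr=3e-4, min_lr=3e-5, warmup_steps=100, total_steps=1000):
    """Linear warmup to max_lr, then cosine annealing down to min_lr.

    All default values are illustrative assumptions.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # After the schedule ends, hold the learning rate at its floor.
        return min_lr
    # Cosine annealing: progress runs from 0 to 1 over the decay phase.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + cosine * (max_lr - min_lr)
```

In a custom PyTorch training loop, a schedule like this is typically applied by setting `param_group["lr"] = lr_at_step(step)` for each entry in `optimizer.param_groups` before calling `optimizer.step()`. Warmup avoids large, unstable updates while the optimizer statistics are still poorly estimated, and the cosine phase lowers the rate smoothly toward the end of training.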

How to Use

  • Install the required dependencies using pip install -r requirements.txt
  • Download your dataset and place it in the data directory
  • Train the model using python main.py
  • Generate stories using python generate.py
  • Run the Streamlit app using python app.py

Future Improvements

  • Implement control using a configuration file
  • Explore different model architectures and hyperparameters
  • Integrate larger and more diverse datasets for training
  • Add functionality for user-specified story themes or genres


Ritav Jash


This project is licensed under the MIT License.