Recurser

A new model and implementation to reduce VRAM usage on transformer models.

Online demos

Recurser reduces the VRAM usage of GPT2-XL by 25%. You can run GPT2-XL (float32) with PyTorch on Colab or on your own GPU.

[Colab]
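
As a point of comparison for the 25% figure, here is a minimal baseline sketch that loads plain GPT2-XL in float32 with Hugging Face's transformers library (see Reference) and reports peak VRAM. The recurser-specific API is not documented in this README, so treat this only as a measurement harness for the baseline side of the comparison:

  import torch
  from transformers import GPT2LMHeadModel, GPT2Tokenizer

  device = "cuda"
  torch.cuda.reset_peak_memory_stats(device)

  # Load the plain GPT2-XL baseline in float32, as in the demo above
  tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
  model = GPT2LMHeadModel.from_pretrained("gpt2-xl", torch_dtype=torch.float32).to(device)

  # Run one short generation and report peak VRAM for the baseline
  inputs = tokenizer("Hello, my name is", return_tensors="pt").to(device)
  with torch.no_grad():
      model.generate(**inputs, max_new_tokens=20)

  peak_gib = torch.cuda.max_memory_allocated(device) / 1024**3
  print(f"Peak VRAM (baseline GPT2-XL, float32): {peak_gib:.2f} GiB")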

Installation

Install the library from PyPI:

  pip install recursers

Todos

  • Re-implement recurser for other models
  • Enable MPS acceleration on Mac (a device-selection sketch follows this list)
  • Retraining: training the recurser differs slightly from the usual transformer training procedure.
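
The MPS item above would hinge on PyTorch's standard device-availability checks. A minimal sketch of the fallback logic, using only stock PyTorch APIs (Recurser itself does not wire this up yet):

  import torch

  def pick_device() -> torch.device:
      # Prefer CUDA, then Apple's MPS backend, then CPU.
      if torch.cuda.is_available():
          return torch.device("cuda")
      if torch.backends.mps.is_available():
          return torch.device("mps")
      return torch.device("cpu")

  print(f"Running on: {pick_device()}")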

Reference

Karpathy's elegant GPT implementation
https://github.com/karpathy/nanoGPT

Hugging Face's transformers library
https://github.com/huggingface/transformers