
RoadrunnerX -- tune LLaMa3.1 on Google Cloud TPUs for 30% lower cost and scale seamlessly!


RoadrunnerX is a framework for continued training and fine-tuning open-source LLMs on the XLA runtime. We take care of the necessary runtime setup and provide a Jupyter notebook out of the box so you can get started right away.

  • Easy to use.
  • Easy to configure all aspects of training (designed for ML researchers and hackers).
  • Easy to scale training from a single VM with 8 TPU cores to an entire TPU Pod with 6,000 TPU cores (a 750x jump) -- see the sketch below for the underlying gcloud commands!
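
RoadrunnerX's launch script (step 2 of the Setup section) creates the TPU VM for you, but as a rough sketch of what that scaling means at the infrastructure level, the commands below show the underlying gcloud call: moving from a single host to a pod slice is just a different --accelerator-type. The VM names, zone, slice sizes, and runtime version here are illustrative placeholders, not values taken from this repo -- check Google's TPU docs for the combinations available to your project.

     # Single-host TPU VM (8 cores) -- names, zone, and version are placeholders
     gcloud compute tpus tpu-vm create my-tuner-vm \
         --zone=us-central1-a \
         --accelerator-type=v3-8 \
         --version=tpu-ubuntu2204-base

     # Same command for a pod-scale slice -- only the accelerator type changes
     gcloud compute tpus tpu-vm create my-tuner-pod \
         --zone=us-central1-a \
         --accelerator-type=v3-256 \
         --version=tpu-ubuntu2204-base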

Goal

Our goal at felafax is to build infra to make it easier to run AI workloads on non-NVIDIA hardware (TPU, AWS Trainium, AMD GPU, and Intel GPU).

Currently supported models

  • LLaMa-3/3.1 8B, 70B on Google Cloud TPUs.
    • Supports LoRA and full-precision training.
    • Tested on TPU v3, v5p.
  • LLaMa-3.1 405B will be available on our cloud platform at felafax.ai -- sign up for the waitlist!
  • Gemma 2 2B on Cloud TPUs. $${\color{red}New!}$$
    • Supports fast full-precision training.
    • Tested on TPU v3, v5p.

Setup

For a hosted version with a seamless workflow, please visit app.felafax.ai.

If you prefer a self-hosted setup, follow the instructions below to launch a TPU VM in your Google Cloud account and start a Jupyter notebook. With just 3 simple steps, you'll be up and running in under 10 minutes. 🚀

  1. Install the gcloud command-line tool and authenticate your account (SKIP this STEP if you already have gcloud installed and have used TPUs before! 😎)

     # Download gcloud CLI
     curl https://sdk.cloud.google.com | bash
     source ~/.bashrc
    
     # Authenticate gcloud CLI
     gcloud auth login
    
     # Create a new project for now (project IDs must be lowercase)
     gcloud projects create llama3-tunerx --set-as-default
    
     # Configure SSH for gcloud
     gcloud compute config-ssh --quiet
    
     # Set up default credentials
     gcloud auth application-default login
    
     # Enable Cloud TPU API access
     gcloud services enable compute.googleapis.com tpu.googleapis.com storage-component.googleapis.com aiplatform.googleapis.com
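    
     # (Optional sanity check, not part of the original steps) Confirm the
     # active project and that the TPU API is now enabled before moving on
     gcloud config get-value project
     gcloud services list --enabled | grep tpu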
  2. Spin up a TPU v5p-8 VM 🤠.

    sh ./launch_tuner.sh

     Keep an eye on the terminal -- you may be asked for your SSH key password and for your Hugging Face token.

  3. Open the Jupyter notebook at https://localhost:8888 and start fine-tuning! (If the notebook isn't reachable, see the port-forwarding sketch below.)
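
     If launch_tuner.sh doesn't already forward the notebook port to your local machine, you can open the SSH tunnel yourself. This is a minimal sketch, assuming the Jupyter server listens on port 8888 on the TPU VM; replace the VM name and zone placeholders with the ones used in step 2.

     # Forward the TPU VM's port 8888 to localhost:8888
     gcloud compute tpus tpu-vm ssh <your-tpu-vm-name> \
         --zone=<your-zone> \
         -- -L 8888:localhost:8888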

Credits: