/tuna

fine tuning, reimagined. welcome to tuna 🎣 - we're simplifying cloud compute architecture, datasets, and more, to get your specialized AI from 0->100 asap


🎣 Tuna


fine tuning, reimagined

tuna is your one-stop (open-source) shop for fine-tuning a code generation model on any codebase hosted publicly or privately on GitHub (more VCS support soon!).

We simplify the entire process: to build out your perfect model, settings and all, just run the initialization command (tuna init) described under Commands below!

Don't have an NVIDIA GPU? Don't worry! Make sure you have an RSA SSH Key available at ~/.ssh/id_rsa.pub, and set up an account and API key on our GPU provider, FluidStack. Minimal prices, maximum development.
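If you're not sure whether the key is already in place, a quick check like the one below can help. This is only a sketch and not part of Tuna: the function name has_rsa_public_key is made up for illustration, and it assumes that an RSA public key file starts with the "ssh-rsa" prefix, which is the standard OpenSSH format.

```python
from pathlib import Path
from typing import Optional


def has_rsa_public_key(path: Optional[Path] = None) -> bool:
    """Return True if `path` exists and looks like an OpenSSH RSA public key.

    Hypothetical helper for illustration; Tuna itself may check differently.
    """
    if path is None:
        # Default location named in this README.
        path = Path.home() / ".ssh" / "id_rsa.pub"
    if not path.is_file():
        return False
    lines = path.read_text(encoding="utf-8", errors="replace").strip().splitlines()
    # OpenSSH RSA public keys begin with the key type "ssh-rsa".
    return bool(lines) and lines[0].startswith("ssh-rsa ")


if __name__ == "__main__":
    if has_rsa_public_key():
        print("Found an RSA public key at ~/.ssh/id_rsa.pub")
    else:
        print("No RSA public key found; `ssh-keygen -t rsa -b 4096` can create one.")
```

If the check fails, ssh-keygen with its default file location writes the key pair to ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub.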

If you're concerned about data privacy and data collection, note that Tuna does not collect any data on you and is entirely open source. Check out the "Data Collection" section below to learn more.

We'd love if you gave us a ⭐, as that's our primary way of tracking user interest! If you're feeling extra generous, you can click Sponsor ❤️. Thank you so much for reading!

Questions? Contact abhi[at]opennote.me.

Note: Tuna is currently supported only on macOS and Linux; Windows support is coming soon.

Documentation

Getting Started

To install tuna, make sure you have Python 3.12+ installed on your machine, then run the command below:

pip install tuna-cli

This will make tuna executable from anywhere on your machine.
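If the install fails, it may be worth confirming that the interpreter behind pip meets the 3.12+ requirement. The snippet below is a minimal sketch of that check; meets_minimum is a hypothetical helper name, not something shipped with tuna-cli.

```python
import sys
from typing import Sequence, Tuple


def meets_minimum(version: Sequence[int] = sys.version_info,
                  minimum: Tuple[int, int] = (3, 12)) -> bool:
    """Return True if (major, minor) of `version` is at least `minimum`."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} satisfies the 3.12+ requirement.")
    else:
        print(f"Python {current} is too old; tuna needs 3.12 or newer.")
```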

Commands

1. Initialize

tuna init

# Initializes a `.tuna` folder
# Authenticates your GitHub credentials
#   - This asks for a GitHub Token
#     which MUST have READ REPO and READ USER permissions
# Lets you select a repository
# Builds a Model Training Dataset
# Sets up Jupyter Environment

2. Serve

tuna serve
# Runs a Local Jupyter Environment with the
# autogenerated notebook and dataset,
# with CPU and Memory monitoring

# By default, this doesn't open the browser
# automatically. Run:
tuna serve --open
# to do that

3. Refresh

tuna refresh
# Recreates the dataset after updating
# from your GitHub project, in case you made
# edits after initializing with Tuna

4. Train (Coming Soon)

tuna train
# Begins training a model on your dataset with a powerful GPU from
# FluidStack (see intro)

# To train locally on current hardware, run
tuna train --local
# (must be on a device with an NVIDIA GPU, since Tuna relies on CUDA)
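
To get a rough idea of whether a machine is a candidate for local training, you can look for the nvidia-smi tool that ships with NVIDIA's drivers. This is a heuristic sketch only: it is an assumption that finding nvidia-smi on PATH implies a usable CUDA setup, the helper name nvidia_smi_on_path is hypothetical, and this is not how Tuna itself detects GPUs.

```python
import shutil
from typing import Callable, Optional


def nvidia_smi_on_path(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Heuristic: True if the `nvidia-smi` executable can be found on PATH.

    `which` is injectable so the check can be exercised without a GPU.
    """
    return which("nvidia-smi") is not None


if __name__ == "__main__":
    if nvidia_smi_on_path():
        print("nvidia-smi found; local training may be possible.")
    else:
        print("nvidia-smi not found; consider the FluidStack route instead.")
```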

5. Helpers

tuna help
# or
tuna github
# or
tuna docs

# All of these will open the GitHub repository for Tuna, where all the documentation
# is served in the README.md file.

6. Purge

tuna purge

# This will delete all tuna-generated files
# in your current directory
# USE WITH CAUTION!

7. No Flags

tuna

# Displays a welcome message

Data Collection

  • After installation of the CLI tool, Tuna runs entirely locally on your system. Apart from the GPU rental services we work with to enable training, we don't store or transfer any data to any internal services. Tuna is strictly open source.

  • GitHub credentials, including OAuth tokens, your username, and your stored repositories, can be cleared by deleting the .tuna directory from the directory where it was created, or by running tuna purge in that directory.

  • FluidStack API keys are also stored locally and can be cleared the same way: delete the .tuna directory from the directory where it was created, or run tuna purge in that directory.

  • All files pulled from GitHub are stored only in the datasets inside the generated .tuna directory. We pull text directly from the GitHub API to spare you unwanted files and dependency installs, and also to protect your environment variables.

  • Unless you share data explicitly with us, we won't ever see your personal data.

  • Disclaimer: We do not own the models that we use for fine-tuning, and their data policies are on their individual websites. Look up your model of choice to learn more.
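
For the manual route mentioned above (deleting the .tuna directory yourself), a small script along these lines would do it. Treat it as a sketch: remove_tuna_dir is a hypothetical helper, and it only removes the .tuna folder, so it may not match everything that tuna purge cleans up.

```python
import shutil
from pathlib import Path


def remove_tuna_dir(project_dir: Path) -> bool:
    """Delete `<project_dir>/.tuna` if it exists.

    Returns True if a directory was removed, False if there was nothing to do.
    This is irreversible, so double-check `project_dir` before calling it.
    """
    target = project_dir / ".tuna"
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


if __name__ == "__main__":
    removed = remove_tuna_dir(Path.cwd())
    print("Removed .tuna" if removed else "No .tuna directory here")
```

Run it from the directory where you originally ran tuna init, since that is where the .tuna folder lives.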

Licensing

  • Tuna is licensed under the MIT license. For more information on usage, contact me at the email at the start of this README.

Contributing

🎣 Happy Tun(a)ing!