clmbot

A framework for training causal language models for bots.

Train a model

Training with Gradient is a simple and free way to get started. Just follow these steps:

1. Set up an account with Gradient.
2. Create a project on Gradient to manage your work. You can name the project anything you like.
3. Create a new workflow under your Gradient project. You can name the workflow anything you like.
4. Create the following datasets on Gradient:
   - clmbot-config
   - clmbot-data
   - clmbot-models
5. Upload a copy of train.yml to the clmbot-config dataset. Edit the file before uploading if you'd like to change the default training parameters.
6. Upload one or more .txt files to the clmbot-data dataset.
7. In a terminal, set the environment variable GRADIENT_TRAIN_WORKFLOW_ID to the ID of your Gradient workflow. Then, run make train to start training.
8. Wait until training has completed.
9. In a terminal, set the environment variable GRADIENT_MODEL_DATASET_ID to the ID of the clmbot-models dataset (run gradient datasets list to see the IDs of your datasets). Then, run make fetch to download the trained model to your local machine.
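
The two terminal steps above fail in confusing ways if the ID variable is missing, so a small pre-flight check can help. The following is only a sketch and is not part of clmbot: the helper names build_command and run and the REQUIRED_ENV table are hypothetical, while the variable names and make targets are the ones given in the steps. It assumes you run it from the repository root, where the Makefile lives.

```python
import os
import subprocess

# Hypothetical mapping: each make target from the steps above and the
# environment variable it expects to find.
REQUIRED_ENV = {
    "train": "GRADIENT_TRAIN_WORKFLOW_ID",
    "fetch": "GRADIENT_MODEL_DATASET_ID",
}


def build_command(target, environ=None):
    """Return the make command for `target`, or raise if its ID variable is unset."""
    environ = os.environ if environ is None else environ
    if target not in REQUIRED_ENV:
        raise ValueError(f"unknown target: {target!r}")
    var = REQUIRED_ENV[target]
    if not environ.get(var):
        raise RuntimeError(f"set {var} before running 'make {target}'")
    return ["make", target]


def run(target):
    """Run the make target in the current directory (assumed to be the repo root)."""
    subprocess.run(build_command(target), check=True)
```

Calling run("train") starts training once GRADIENT_TRAIN_WORKFLOW_ID is set; after training has completed, run("fetch") downloads the model once GRADIENT_MODEL_DATASET_ID is set. Keeping the check separate from the subprocess call makes it easy to verify the environment without launching anything.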

Training with Python is not complicated, but you will probably need a GPU to do it in a reasonable amount of time. Just follow these steps:

Deploying a model is straightforward: