
🚀 Turbo-Alignment

Library for industrial alignment.

🌟 What is Turbo-Alignment?

Turbo-Alignment is a library designed to streamline the fine-tuning and alignment of large language models, leveraging advanced techniques to enhance efficiency and scalability.

✨ Key Features

  • 📊 Comprehensive Metrics and Logging: Includes a wide range of metrics, such as Self-BLEU, KL divergence, and diversity, all supported out of the box.
  • 🛠️ Streamlined Method Deployment: Simplifies the process of deploying new methods, allowing for quick development and integration of new datasets and trainers into your pipelines.
  • 📚 Ready-to-Use Examples: Convenient examples with configurations and instructions for basic tasks.
  • ⚡ Fast Inference: Optimized for quick inference using vLLM.
  • 🔄 End-to-End Pipelines: From data preprocessing to model alignment.
  • 🌐 Multimodal Capabilities: Extensive support for various multimodal functions like Vision Language Modeling.
  • 🔍 RAG Pipeline: Unique pipeline for end2end retrieval-augmented generation training.

🛠️ Supported Methods

Turbo-Alignment supports a wide range of methods for model training and alignment, including:

  • 🎯 Supervised Fine-Tuning (SFT)
  • 🏆 Reward Modeling (RM)
  • 👍 Direct Preference Optimization (DPO)
  • 🧠 Kahneman & Tversky Optimization (KTO) Paired/Unpaired
  • 🔄 Contrastive Preference Optimization (CPO)
  • 🎭 Identity Preference Optimization (IPO)
  • 🌟 Sequence Likelihood Calibration with Human Feedback (SLiC-HF)
  • 📊 Statistical Rejection Sampling Optimization (RSO)
  • 🌁 Vision Language Modeling with a trainable projection model: MLP (from LLaVA) or C-Abstractor (from HoneyBee)
  • 🗂️ Retrieval-Augmented Generation (RAG)

🧮 Implemented Metrics

  • 🔠 Distinctness
  • 🌈 Diversity
  • 🔵 Self-BLEU
  • 📐 KL-divergence
  • 🏆 Reward
  • 📏 Length
  • 🌀 Perplexity
  • 🌟 METEOR
  • 🔍 Retrieval Utility
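
As a quick illustration of the first two metrics above, the snippet below sketches the distinct-n statistic commonly used for distinctness and diversity scoring: the ratio of unique n-grams to total n-grams across a set of generations. This is a generic reference implementation for intuition, not the library's own code.

def distinct_n(texts: list[str], n: int = 2) -> float:
    """Ratio of unique n-grams to total n-grams across generations."""
    total, unique = 0, set()
    for text in texts:
        tokens = text.split()  # whitespace tokenization, for illustration only
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

print(distinct_n(["the cat sat", "the cat ran"]))  # 0.75: 3 of the 4 bigrams are unique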

🤖 How to Use

Turbo-Alignment offers an intuitive interface for training and aligning large language models. Refer to the detailed examples and configuration files in the documentation to get started quickly with your specific use case. A user-friendly guide is available here.

The most crucial step is preparing the dataset in the required format; after that, the pipeline handles everything automatically. Examples of datasets are available here.
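
Datasets are commonly stored as JSON Lines files, one record per line; the exact record schema depends on the dataset type and is covered per use case below. A minimal sketch of writing such a file, with a placeholder chat-style record (the field names here are assumptions, so rely on the linked examples for the authoritative schema):

import json

# Placeholder record; field names are assumed for illustration.
records = [
    {
        "id": "0",
        "messages": [
            {"role": "user", "content": "What is alignment?"},
            {"role": "bot", "content": "Tuning a model to follow human preferences."},
        ],
    }
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")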

Table of use cases

Train

Supervised Fine-Tuning

  • 📚 Dataset type: prepare your dataset in the ChatDataset format, examples available here (the chat-style record sketched above shows the general shape)
  • 📝 Config example: sft.json
  • 🖥️ CLI launch command
python -m turbo_alignment train_sft --experiment_settings_path configs/exp/train/sft/sft.json

Preference Tuning

Reward Modeling

  • 📚 Dataset type: prepare your dataset in the PairPreferencesDataset format, examples available here (an illustrative record is sketched below)
  • 📝 Config example: rm.json
  • 🖥️ CLI launch command
python -m turbo_alignment train_rm --experiment_settings_path configs/exp/train/rm/rm.json
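
A pair-preferences record typically pairs one shared context with a preferred and a rejected answer. The layout below is an illustrative Python rendering with assumed field names; the linked dataset examples are authoritative.

# Hypothetical PairPreferencesDataset record (field names assumed).
rm_record = {
    "id": "0",
    "context": [{"role": "user", "content": "Explain RLHF in one sentence."}],
    "answer_w": {"role": "bot", "content": "A concise, correct explanation."},   # preferred answer
    "answer_l": {"role": "bot", "content": "An off-topic, low-quality reply."},  # rejected answer
}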

DPO, IPO, CPO, KTO (Paired)

  • 📚 Dataset type: prepare your dataset in the PairPreferencesDataset format, examples available here (same layout as for Reward Modeling above)
  • 📝 Config example: dpo.json
  • 🖥️ CLI launch command
python -m turbo_alignment train_dpo --experiment_settings_path configs/exp/train/dpo/dpo.json

KTO (Unpaired)

  • 📚 Dataset type: prepare your dataset in the KTODataset format, examples available here (an illustrative record is sketched below)
  • 📝 Config example: kto.json
  • 🖥️ CLI launch command
python -m turbo_alignment train_kto --experiment_settings_path configs/exp/train/kto/kto.json
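
Unlike the paired formats above, unpaired KTO attaches a per-sample desirability label to a single completion instead of storing a chosen/rejected pair. An illustrative record with assumed field names:

# Hypothetical KTODataset record (field names assumed).
kto_record = {
    "id": "0",
    "context": [{"role": "user", "content": "Summarize this paragraph."}],
    "answer": {"role": "bot", "content": "A faithful one-sentence summary."},
    "is_desirable": True,  # thumbs-up/down label replacing the paired comparison
}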

⌛️ in progress...

Multimodal Tasks

To start multimodal training, you should:

  • Prepare the multimodal dataset. See examples here.
  • Preprocess the data (OPTIONAL). If you plan to run many experiments on the same dataset, preprocess it once. The preprocessing stage reads pixel_values from images, encodes them with the specified encoder, and saves them in safetensors format. During training, the pipeline then skips reading and encoding images and simply loads the prepared encodings from the safetensors files (see the sketch after this list).
  • Suitable configs: llava.json, c_abs.json
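
The optional preprocessing step boils down to encoding every image once and caching the result so training never re-reads raw pixels. Below is a minimal sketch of that idea using a Hugging Face CLIP encoder and the safetensors package; the function, checkpoint, and cache layout are illustrative assumptions, not the pipeline's actual API.

import torch
from PIL import Image
from safetensors.torch import save_file
from transformers import AutoImageProcessor, AutoModel

# Illustrative one-off encoding pass; the real pipeline's interface and file layout may differ.
processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
encoder = AutoModel.from_pretrained("openai/clip-vit-base-patch32")

def cache_image_encodings(image_paths: list[str], out_path: str) -> None:
    tensors = {}
    for path in image_paths:
        pixel_values = processor(images=Image.open(path), return_tensors="pt").pixel_values
        with torch.no_grad():
            tensors[path] = encoder.get_image_features(pixel_values)  # one embedding per image
    save_file(tensors, out_path)  # training can later load embeddings instead of re-encoding

cache_image_encodings(["example.png"], "image_encodings.safetensors")  # example usage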

⌛️ in progress...

RAG

To launch RAG:

  • Choose a base encoder and create a document index.

  • For end-to-end:

    • Train both the retriever and the generator.
    • Prepare the data with "dataset_type": "chat", mapping query -> response (example record below).
    • Suitable config: end2end_rag
  • For sft-rag:

    • Train only the generator.
    • Prepare the data with "dataset_type": "chat", mapping query+retrieved_documents -> response (example record below).
    • Suitable config: sft_with_retrieval_utility
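
To make the two data layouts concrete, the records below contrast the two modes. Field names and document formatting are assumptions, so consult the linked configs and dataset examples for the actual schema.

# Hypothetical chat records for the two RAG modes (field names assumed).

# end-to-end: the retriever fetches documents during training, so the input is the bare query.
end2end_record = {
    "id": "0",
    "messages": [
        {"role": "user", "content": "Which encoder builds the document index?"},
        {"role": "bot", "content": "An answer grounded in the retrieved passages."},
    ],
}

# sft-rag: retrieval happens offline, so retrieved documents are prepended to the query.
sft_rag_record = {
    "id": "1",
    "messages": [
        {"role": "user", "content": "Documents: <retrieved passages>\n\nQuestion: Which encoder builds the document index?"},
        {"role": "bot", "content": "An answer grounded in the retrieved passages."},
    ],
}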

Inference

⌛️ in progress...

Sampling

⌛️ in progress...

Common

⌛️ in progress...

🚀 Installation

📦 Python Package

pip install turbo-alignment

🛠️ From Source

For the latest features before an official release:

pip install git+https://github.com/turbo-llm/turbo-alignment.git

📂 Repository

Clone the repository for access to examples:

git clone https://github.com/turbo-llm/turbo-alignment.git

🌱 Development

Contributions are welcome! Read the contribution guide and set up the development environment:

git clone https://github.com/turbo-llm/turbo-alignment.git
cd turbo-alignment
poetry install

📍 Library Roadmap

  • Increasing the number of tutorials
  • Enhancing test coverage
  • Implementation of online RL methods such as PPO and REINFORCE
  • Facilitating distributed training
  • Incorporating low-memory training approaches

❓ FAQ

How do I install Turbo-Alignment?

See the Installation section for detailed instructions.

Where can I find docs?

Guides and docs are available here.

Where can I find tutorials?

Tutorials are available here.

📝 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.