WIP
CoLLiE provides collaborative and efficient tuning methods for large language models based on DeepSpeed and PyTorch. It primarily includes the following four features:
- Parallelism Strategies
  - Data Parallelism
  - Pipeline Parallelism
  - Tensor Parallelism
  - Zero Redundancy Optimizer (ZeRO)
- Models
- Memory-efficient Fine-tuning Methods
  - Inplace SGD
  - LoRA
- Friendly Usage
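To illustrate the LoRA idea listed above (this is a minimal NumPy sketch of the general technique, not CoLLiE's implementation; the shapes and the `alpha / r` scaling follow the original LoRA formulation):

```python
import numpy as np

# Minimal sketch of a LoRA-style low-rank update (not CoLLiE's actual code).
# Instead of updating the full d_out x d_in weight W, LoRA trains two small
# matrices A (r x d_in) and B (d_out x r) and adds their scaled product to W.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 4

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in))       # trainable, random init
B = np.zeros((d_out, r))                 # trainable, zero init

def lora_forward(x):
    # Base path plus scaled low-rank path; since B = 0 at init, the
    # adapted model starts out identical to the pretrained one.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal((1, d_in))
assert np.allclose(lora_forward(x), x @ W.T)  # B = 0, so no change at init
```

Only `A` and `B` receive gradients during fine-tuning, which is why the memory footprint stays small compared to full-parameter updates.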
CoLLiE has rewritten its models using Megatron-LM and Flash Attention, allowing you to enjoy 3D parallelism simply by setting `config.dp_size`, `config.pp_size`, and `config.tp_size` (note that the product of these three parallelism sizes should equal the number of GPUs). You can also choose whether to use Flash Attention by setting `config.use_flash`.
For convenience, CoLLiE's models also support Huggingface-style methods, so you can load weights from HF with `model.from_pretrained()`.
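The constraint that the three parallelism degrees must multiply to the GPU count can be sketched as a small sanity check (the function name here is illustrative, not part of CoLLiE's API):

```python
def check_parallel_sizes(dp_size, pp_size, tp_size, n_gpus):
    """Return True if the 3D parallelism layout covers exactly n_gpus.

    Mirrors the CoLLiE requirement that
    config.dp_size * config.pp_size * config.tp_size == number of GPUs.
    """
    return dp_size * pp_size * tp_size == n_gpus

# e.g. 8 GPUs split as 2-way data, 2-way pipeline, 2-way tensor parallel
assert check_parallel_sizes(2, 2, 2, 8)
assert not check_parallel_sizes(2, 2, 2, 6)  # 8 processes won't fit on 6 GPUs
```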
If you don't want to write a training loop yourself, CoLLiE provides a trainer: simply supply a config and a dataset to run your custom training process.
More examples are available in the examples directory.
CoLLiE offers integration with torchrun and slurm to enable easy launching of jobs on a single or multiple nodes.
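As an illustration, a single-node job could be launched with torchrun (the script name and GPU count are placeholders, not taken from the CoLLiE docs):

```shell
# launch train.py on one node with 8 GPUs; torchrun spawns one process per GPU
torchrun --standalone --nproc_per_node=8 train.py
```

On a slurm cluster, the same script would instead be submitted through your site's usual `sbatch`/`srun` workflow.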
```shell
# install NVIDIA Apex with its C++/CUDA extensions
git clone https://github.com/NVIDIA/apex.git
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

# install Megatron-LM core
pip install git+https://github.com/NVIDIA/Megatron-LM.git@main#egg=megatron.core

# install CoLLiE from source
git clone https://github.com/OpenLMLab/collie.git
cd collie
pip install -r requirements.txt
python setup.py install
```