WIP
CoLLiE provides collaborative and efficient tuning methods for large language models based on DeepSpeed and PyTorch. It primarily includes the following four features:
- Parallelism Strategies
  - Data Parallelism
  - Pipeline Parallelism
  - Tensor Parallelism
  - Zero Redundancy Optimizer (ZeRO)
- Models
- Memory-efficient Fine-tuning Methods
  - Inplace SGD
  - LoRA
- Friendly Usage
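To illustrate the LoRA idea listed above (this is a minimal NumPy sketch of the general technique, not CoLLiE's implementation; the shapes and the `alpha / r` scaling follow the original LoRA formulation):

```python
import numpy as np

# Minimal sketch of a LoRA-style low-rank update (not CoLLiE's actual code).
# Instead of updating the full d_out x d_in weight W, LoRA trains two small
# matrices A (r x d_in) and B (d_out x r) and adds their scaled product to W.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 4

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in))       # trainable, random init
B = np.zeros((d_out, r))                 # trainable, zero init

def lora_forward(x):
    # Base path plus scaled low-rank path; since B = 0 at init, the
    # adapted model starts out identical to the pretrained one.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal((1, d_in))
assert np.allclose(lora_forward(x), x @ W.T)  # B = 0, so no change at init
```

Only `A` and `B` receive gradients during fine-tuning, which is why the memory footprint stays small compared to full-parameter updates.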
CoLLiE has rewritten its models using Megatron-LM and Flash Attention, allowing you to enjoy 3D parallelism simply by setting `config.dp_size`, `config.pp_size`, and `config.tp_size` (note that the product of these three parallelism sizes should equal the number of GPUs). You can also choose whether to use Flash Attention by setting `config.use_flash`.
For convenience, CoLLiE's models also support Huggingface-style methods, so you can load weights from HF with `model.from_pretrained()`.
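The constraint that the three parallelism degrees must multiply to the GPU count can be sketched as a small sanity check (the function name here is illustrative, not part of CoLLiE's API):

```python
def check_parallel_sizes(dp_size, pp_size, tp_size, n_gpus):
    """Return True if the 3D parallelism layout covers exactly n_gpus.

    Mirrors the CoLLiE requirement that
    config.dp_size * config.pp_size * config.tp_size == number of GPUs.
    """
    return dp_size * pp_size * tp_size == n_gpus

# e.g. 8 GPUs split as 2-way data, 2-way pipeline, 2-way tensor parallel
assert check_parallel_sizes(2, 2, 2, 8)
assert not check_parallel_sizes(2, 2, 2, 6)  # 8 processes won't fit on 6 GPUs
```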
If you don't want to write a training loop yourself, CoLLiE provides a trainer: simply supply a config and a dataset to run your custom training process.
More examples are available in the examples directory.
CoLLiE offers integration with torchrun and slurm to enable easy launching of jobs on a single or multiple nodes.
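As an illustration, a single-node job could be launched with torchrun (the script name and GPU count are placeholders, not taken from the CoLLiE docs):

```shell
# launch train.py on one node with 8 GPUs; torchrun spawns one process per GPU
torchrun --standalone --nproc_per_node=8 train.py
```

On a slurm cluster, the same script would instead be submitted through your site's usual `sbatch`/`srun` workflow.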
```shell
# install NVIDIA Apex with its C++/CUDA extensions
git clone https://github.com/NVIDIA/apex.git
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

# install Megatron-LM core
pip install git+https://github.com/NVIDIA/Megatron-LM.git@main#egg=megatron.core

# install CoLLiE from source
git clone https://github.com/OpenLMLab/collie.git
cd collie
pip install -r requirements.txt
python setup.py install
```