
LLM-Train-Template

Repository that contains LLM training templates that make LLM training easier.
Templates will be provided in two versions, using PyTorch Lightning and the Hugging Face Trainer, respectively.

Currently, only the PyTorch Lightning version is available. 🚧

The template is compatible with Weights & Biases and DeepSpeed.
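
As a rough illustration (not the template's actual code), PyTorch Lightning 1.7 can be wired to both tools through its built-in WandbLogger and DeepSpeed strategy; the project name below is a hypothetical placeholder.

import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger

# Log metrics to a Weights & Biases project (hypothetical project name).
logger = WandbLogger(project="my-llm-project")

trainer = pl.Trainer(
    accelerator="gpu",
    devices=1,
    precision=16,                  # mixed precision is common with DeepSpeed
    strategy="deepspeed_stage_2",  # Lightning's built-in DeepSpeed ZeRO stage 2 strategy
    logger=logger,
)
# trainer.fit(model, datamodule=datamodule)  # model/datamodule come from the template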



How to use

These templates are designed for a CLI environment,
so you can change the model hyperparameters through argparse and shell scripts.
Each template comes with a start_train.sh.
You can run the template by running this script.

sh start_train.sh

In the start_train.sh file, you can set the model's hyperparameters.

python train.py --model_name  "type your own value" \
                --tokenizer_path "type your own value" \
                --train_data_path "type your own value" \
                --valid_data_path "type your own value" \
                --wandb_project "type your own value" \
                --batch_size 16 \
                --device 0 \
                --max_source_length 512 \
                --vocab_size 64100
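
In train.py, these flags can be consumed with argparse. The sketch below is an assumption about how that parsing might look (flag names are taken from start_train.sh; the defaults are illustrative only):

import argparse

def parse_args():
    # Flag names mirror start_train.sh; required/default choices are illustrative.
    parser = argparse.ArgumentParser(description="LLM training template")
    parser.add_argument("--model_name", type=str, required=True)
    parser.add_argument("--tokenizer_path", type=str, required=True)
    parser.add_argument("--train_data_path", type=str, required=True)
    parser.add_argument("--valid_data_path", type=str, required=True)
    parser.add_argument("--wandb_project", type=str, required=True)
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--device", type=int, default=0)
    parser.add_argument("--max_source_length", type=int, default=512)
    parser.add_argument("--vocab_size", type=int, default=64100)
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(args)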
                

🚨 Also, some modifications to the template are needed to fit your environment and task 🚨

Environment

This template has been tested on this Docker image.

However, for users who do not use Docker, the versions of the core libraries are listed below (Python 3.9.13):

deepspeed==0.6.4
huggingface-hub==0.14.1
numpy==1.23.0
pandas==1.4.3
sentencepiece==0.1.99
torch==1.10.2+cu111
torcheval==0.0.6
torchmetrics==0.9.2
torchtext==0.11.2
torchtnt==0.1.0
torchvision==0.11.3+cu111
transformers==4.28.1
pytorch-lightning==1.7.0

Models

(Currently, only the PyTorch Lightning version is available.)

  • BERT
  • T5
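
As a sketch of how such a model might be wrapped for the PyTorch Lightning template (an assumption, not the template's actual module; T5 is shown, and BERT would follow the same pattern with a different Auto class):

import torch
import pytorch_lightning as pl
from transformers import AutoModelForSeq2SeqLM

class LitSeq2Seq(pl.LightningModule):
    # Hypothetical wrapper around a Hugging Face seq2seq model (e.g. "t5-base").
    def __init__(self, model_name: str, lr: float = 3e-5):
        super().__init__()
        self.model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        # batch is assumed to contain input_ids, attention_mask, and labels.
        loss = self.model(**batch).loss
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)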