/llm3s-conatiner

Large language model training in three stages (SFT, reward model, RLHF), plus deployment

Primary language: Python

Complete docs

Install the environment

First install PyTorch 2.0 by following https://pytorch.org/get-started/locally/, then install the remaining dependencies:

pip install -r requirements.txt
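Before starting training, it can help to confirm that the installed PyTorch is at least version 2.0. The following is a minimal sketch, not part of this repository: the helper names `meets_minimum` and `check_torch` are hypothetical, and the check only compares the major and minor version numbers.

```python
import importlib.util
import re


def meets_minimum(version, minimum=(2, 0)):
    """Return True if a version string such as '2.0.1+cu118' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def check_torch():
    """Describe the installed PyTorch, or report that it is missing."""
    # find_spec avoids an ImportError when PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    status = "OK" if meets_minimum(torch.__version__) else "older than 2.0"
    return (
        f"PyTorch {torch.__version__} ({status}), "
        f"CUDA available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(check_torch())
```

Running the script prints one line describing the installed version and whether a CUDA device is visible.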

Deploy the necessary settings

Train the SFT model

bash run.sh

Train the reward model

bash run-reward.sh

Train the RLHF model

bash run-rlhf.sh
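The three training scripts above can also be chained from Python. This is a hedged sketch rather than something the repository provides: `run_stages` and `STAGES` are hypothetical names, and it assumes the stages should run in the order listed (SFT, reward, RLHF) and that a later stage should not start if an earlier one fails.

```python
import subprocess

# Script names as listed in this README, in training order.
STAGES = [
    ("SFT", "run.sh"),
    ("reward", "run-reward.sh"),
    ("RLHF", "run-rlhf.sh"),
]


def run_stages(stages=STAGES, runner=subprocess.run):
    """Run each stage script with bash, stopping at the first failure.

    Returns the names of the stages that completed successfully.
    `runner` is injectable so the control flow can be tested without
    launching real training jobs.
    """
    completed = []
    for name, script in stages:
        result = runner(["bash", script])
        if result.returncode != 0:
            print(f"stage {name} failed with exit code {result.returncode}")
            break
        completed.append(name)
    return completed
```

Calling `run_stages()` from the repository root runs the scripts one after another and returns the list of stages that finished, so a partial run is easy to detect.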

Prepare data

SFT data

See sft-data-construction.

Reward data and RLHF data

See rlhf-ppo.