Code used to fine-tune this model: abacaj/starcoderbase-1b-sft.
Note: the data in the data/ folder is not the full training set. The full set is available here: evol-codealpaca-v1
Install dependencies:
python -m venv env \
&& source env/bin/activate \
&& pip install -r requirements.txt
Run training code:
torchrun --nnodes=1 --nproc-per-node=<REPLACE_WITH_NUMBER_OF_GPUS> train.py
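If you would rather not fill in the GPU count by hand, the following is a minimal sketch that builds the same torchrun command. It assumes an NVIDIA machine with `nvidia-smi` on the PATH; the helper names `count_gpus` and `build_command` are hypothetical and not part of this repo.

```python
import shutil
import subprocess


def count_gpus() -> int:
    """Count GPUs listed by `nvidia-smi -L`; return 0 if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return 0
    # Each physical GPU is printed on a line that starts with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def build_command(num_gpus: int) -> list[str]:
    """Build the single-node torchrun command from this README for num_gpus GPUs."""
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    return ["torchrun", "--nnodes=1", f"--nproc-per-node={num_gpus}", "train.py"]
```

For example, `" ".join(build_command(count_gpus()))` gives a command string you can paste into the shell, and it raises `ValueError` when no GPU is detected.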
To add data, place JSONL files in data/ and edit train.py at lines 154 and 155.
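Before dropping a new file into data/, it can help to check that every line is a valid JSON object. The sketch below is one way to do that; `check_jsonl` is a hypothetical helper, and the key names in the usage example (`instruction`, `output`) are an assumption about the record layout, so match them to whatever train.py actually reads.

```python
import json
from pathlib import Path


def check_jsonl(path, required_keys=()):
    """Return a list of (line_number, problem) tuples for a JSONL file."""
    problems = []
    with Path(path).open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                problems.append((number, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                problems.append((number, f"invalid JSON: {error.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((number, "not a JSON object"))
                continue
            # required_keys is optional; the right keys depend on train.py.
            missing = [key for key in required_keys if key not in record]
            if missing:
                problems.append((number, f"missing keys: {missing}"))
    return problems
```

Usage: `check_jsonl("data/my_file.jsonl", required_keys=("instruction", "output"))` returns an empty list when the file is clean. The file name here is only a placeholder.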
See: wandb