Does it support multiple-node, multiple-GPU fine-tuning?
qingyuan18 opened this issue · 1 comment
qingyuan18 commented
As the title says, I couldn't find a multi-node, multi-GPU fine-tuning command guide (for ChatGLM).
Does it use accelerate / deepspeed for parallel training?
I can see the node_rank parameter in the script:
python3 -m torch.distributed.launch --nproc_per_node 4 \
    --nnodes=1 --node_rank=0 --master_addr=xxx --master_port=yyy \
    uniform_finetune.py --model_type chatglm --model_name_or_path THUDM/chatglm-6b \
    --data alpaca-belle-cot --lora_target_modules query_key_value \
    --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 --per_gpu_train_batch_size 2 \
    --learning_rate 2e-5 --epochs 1
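My assumption (I didn't find this documented anywhere) is that a two-node run would reuse the same command with --nnodes=2 and a per-node --node_rank, e.g. on the second machine:

python3 -m torch.distributed.launch --nproc_per_node 4 \
    --nnodes=2 --node_rank=1 --master_addr=<node0_ip> --master_port=yyy \
    uniform_finetune.py --model_type chatglm --model_name_or_path THUDM/chatglm-6b \
    --data alpaca-belle-cot --lora_target_modules query_key_value \
    --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 --per_gpu_train_batch_size 2 \
    --learning_rate 2e-5 --epochs 1

Is that the intended way to run on multiple nodes?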
dkqkxx commented
You can use torchrun:
torchrun --nnodes 1 --nproc_per_node $ngpu uniform_finetune.py $args --data $data
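For multiple nodes, my understanding is that torchrun accepts the same static-rendezvous flags as torch.distributed.launch (--nnodes, --node_rank, --master_addr, --master_port), so a 2-node launch would look roughly like the sketch below. This is untested here; <node0_ip> and the port are placeholders, and it assumes uniform_finetune.py picks up the RANK/LOCAL_RANK environment variables that torchrun sets.

# Run this on every node; use --node_rank 0 on the master node and 1 on the other node.
torchrun --nnodes 2 --node_rank $node_rank \
    --master_addr <node0_ip> --master_port 29500 \
    --nproc_per_node $ngpu uniform_finetune.py $args --data $data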