Finetuning with multiple GPUs extremely slow
Closed this issue · 2 comments
SergioG-M commented
Hi, I finetuned Falcon-7B on one GPU (A10G) with LoRA, and it was reasonably fast, with an iteration time of approximately 100 ms.
I tried the same with 8 GPUs, using the same configuration, and it is super slow, with an iteration time of 30 seconds! I guess this is not normal. Do you have any idea what may be happening?
My end goal is to finetune a Falcon-40B, but I won't be able to do it at these speeds.
SergioG-M commented
Ok, it wasn't a problem with FSDP but with the machine (SageMaker) I was using, which has low inter-GPU bandwidth.
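The diagnosis above is consistent with a back-of-envelope estimate: with FSDP, each iteration moves roughly the full (sharded) weights over the interconnect, so communication time alone can dominate when inter-GPU bandwidth is low. The sketch below uses illustrative, assumed numbers (parameter count, bf16 weights, example bandwidths), not measurements from this issue.

```python
# Rough lower bound on per-iteration communication time under FSDP:
# bytes moved / interconnect bandwidth. All numbers are assumptions
# for illustration, not measurements.

PARAMS = 7e9            # Falcon-7B parameter count (approximate)
BYTES_PER_PARAM = 2     # bf16 weights

def min_comm_time_s(bandwidth_gb_s: float) -> float:
    """Lower bound (seconds) on per-iteration communication,
    assuming roughly one full pass of the weights over the wire."""
    bytes_moved = PARAMS * BYTES_PER_PARAM
    return bytes_moved / (bandwidth_gb_s * 1e9)

# NVLink-class interconnect (assumed ~300 GB/s): well under 0.1 s
print(f"fast interconnect: {min_comm_time_s(300):.3f} s/iter of pure comm")
# Slow host-routed path (assumed ~1 GB/s effective): tens of seconds
print(f"slow interconnect: {min_comm_time_s(1):.1f} s/iter of pure comm")
```

Under these assumptions, a ~1 GB/s effective path alone accounts for on the order of 14 s per iteration, the same ballpark as the 30 s observed, whereas a fast interconnect would add negligible overhead.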
rasbt commented
Glad to hear that you were able to resolve it and that it wasn't a bug in litgpt in your case :)