Finetuning with multiple GPUs extremely slow
Closed this issue · 2 comments
SergioG-M commented
Hi, I finetuned Falcon-7B on one GPU (A10G) with LoRA, and it was reasonably fast, with an iteration time of approximately 100 ms.
I tried the same with 8 GPUs, using the same configuration, and it is super slow, with an iteration time of 30 seconds! I guess this is not normal. Do you have any idea what may be happening?
My end goal is to finetune a Falcon-40B, but I won't be able to do it at these speeds.
SergioG-M commented
Ok, it wasn't a problem with FSDP but with the machine (SageMaker) I was using, which has low inter-GPU bandwidth.
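The diagnosis above is consistent with a back-of-envelope estimate: with FSDP, each iteration moves roughly the full (sharded) weights over the interconnect, so communication time alone can dominate when inter-GPU bandwidth is low. The sketch below uses illustrative, assumed numbers (parameter count, bf16 weights, example bandwidths), not measurements from this issue.

```python
# Rough lower bound on per-iteration communication time under FSDP:
# bytes moved / interconnect bandwidth. All numbers are assumptions
# for illustration, not measurements.

PARAMS = 7e9            # Falcon-7B parameter count (approximate)
BYTES_PER_PARAM = 2     # bf16 weights

def min_comm_time_s(bandwidth_gb_s: float) -> float:
    """Lower bound (seconds) on per-iteration communication,
    assuming roughly one full pass of the weights over the wire."""
    bytes_moved = PARAMS * BYTES_PER_PARAM
    return bytes_moved / (bandwidth_gb_s * 1e9)

# NVLink-class interconnect (assumed ~300 GB/s): well under 0.1 s
print(f"fast interconnect: {min_comm_time_s(300):.3f} s/iter of pure comm")
# Slow host-routed path (assumed ~1 GB/s effective): tens of seconds
print(f"slow interconnect: {min_comm_time_s(1):.1f} s/iter of pure comm")
```

Under these assumptions, a ~1 GB/s effective path alone accounts for on the order of 14 s per iteration, the same ballpark as the 30 s observed, whereas a fast interconnect would add negligible overhead.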
rasbt commented
Glad to hear that you were able to resolve it and that it wasn't a bug in litgpt in your case :)