Lightning-AI/litgpt

Finetuning with multiple GPUs extremely slow

Closed this issue · 2 comments

Hi, I finetuned Falcon-7B on one GPU (A10G) with LoRA, and it was reasonably fast, with an iteration time of approximately 100 ms.


I tried the same with 8 GPUs, using the same configuration, and it was extremely slow, with an iteration time of 30 seconds! I assume this is not normal. Do you have any idea what may be happening?


My end goal is to finetune a Falcon-40B, but I won't be able to do it at these speeds.

OK, it wasn't a problem with FSDP but with the machine (a SageMaker instance) I was using, which had low inter-GPU bandwidth.
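For anyone hitting the same symptom: a quick way to sanity-check whether communication (rather than compute) is the bottleneck is to time a collective such as all-reduce, since FSDP's per-step cost is dominated by it on slow interconnects. Below is a minimal, hedged sketch (the function name and defaults are ours, not litgpt's). It uses the `gloo` backend with a single CPU process so it runs anywhere; on a real multi-GPU node you would launch one process per GPU with `torchrun` and use the `nccl` backend instead.

```python
import os
import time
import torch
import torch.distributed as dist

def benchmark_all_reduce(numel: int = 1 << 20, iters: int = 10) -> float:
    """Return the average wall-clock time in seconds of one all-reduce
    over a float32 tensor with `numel` elements.

    This sketch runs rank 0 of a world of size 1 on CPU via gloo so it is
    self-contained; swap in backend="nccl" and launch with torchrun
    (one process per GPU) to measure the actual interconnect.
    """
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", rank=0, world_size=1)
    x = torch.ones(numel)
    dist.all_reduce(x)  # warmup so setup cost is excluded from timing
    start = time.perf_counter()
    for _ in range(iters):
        dist.all_reduce(x)
    elapsed = (time.perf_counter() - start) / iters
    dist.destroy_process_group()
    return elapsed

if __name__ == "__main__":
    t = benchmark_all_reduce()
    print(f"avg all-reduce time: {t * 1e3:.2f} ms")
```

If the per-call time scaled up to your model's gradient size approaches your observed 30 s step time, the interconnect is the culprit; on hardware with NVLink the same collective should be orders of magnitude faster than over a slow PCIe/network path.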

rasbt commented

Glad to hear that you were able to resolve it and that it wasn't a bug in litgpt in your case :)