chaoyi-wu/Finetune_LLAMA

How to do multi-node FSDP

boyue-jiang opened this issue · 1 comment

Hello, I saw in the paper that you trained on 32 GPUs during the pretraining stage. How can I use the Trainer's FSDP support for multi-node training? For example, if I want to train on 16 A100s across 2 nodes, how should I set this up with the Trainer, and will the model be sharded across the 16 GPUs?

I have also run into a similar problem. I want to use PyTorch's FSDP to train across multiple nodes, but the process hangs. Is there any configuration or example I can follow?
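
For anyone hitting the same question, here is a minimal sketch of one way to do this with the Trainer's built-in FSDP integration plus a `torchrun` launch on each node. This is not the maintainers' exact recipe; the script name, model path, toy dataset, and the `LlamaDecoderLayer` wrap class are placeholders you would need to adapt to your own setup.

```python
# train_fsdp.py -- minimal sketch of multi-node FSDP with the Hugging Face Trainer.
# Launch the same command on every node (2 nodes x 8 A100s here), e.g.:
#   node 0: torchrun --nnodes=2 --nproc_per_node=8 --node_rank=0 \
#               --master_addr=<NODE0_IP> --master_port=29500 train_fsdp.py
#   node 1: torchrun --nnodes=2 --nproc_per_node=8 --node_rank=1 \
#               --master_addr=<NODE0_IP> --master_port=29500 train_fsdp.py
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    default_data_collator,
)

model_path = "path/to/llama-7b"  # placeholder: local or Hub path to your LLaMA weights


class ToyDataset(torch.utils.data.Dataset):
    """Tiny stand-in dataset so the script runs end to end; replace with real data."""

    def __init__(self, tokenizer, length=64):
        enc = tokenizer("hello world", return_tensors="pt")
        self.input_ids = enc["input_ids"][0]
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        return {"input_ids": self.input_ids, "labels": self.input_ids}


def main():
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)

    training_args = TrainingArguments(
        output_dir="./output",
        per_device_train_batch_size=1,
        bf16=True,
        # FSDP: shard parameters, gradients and optimizer states across all ranks.
        # The Trainer reads WORLD_SIZE / RANK / LOCAL_RANK set by torchrun, so the
        # same script works on both nodes.
        fsdp="full_shard auto_wrap",
        fsdp_transformer_layer_cls_to_wrap="LlamaDecoderLayer",
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=ToyDataset(tokenizer),
        data_collator=default_data_collator,
    )
    trainer.train()


if __name__ == "__main__":
    main()
```

With `full_shard`, FSDP shards the parameters, gradients, and optimizer states across all 16 ranks, so each GPU holds only a slice of the model. When a multi-node run hangs at startup, it is often a networking issue (wrong `--master_addr`/`--master_port`, or NCCL unable to reach the other node); running with `NCCL_DEBUG=INFO` can help narrow that down.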