Training on 8 Nvidia RTX A6000
Top34051 commented
Hi Authors, thank you so much for your huge contribution!! I'm pretty new to the optimization workarounds for training large models, so I'm struggling to get training started for Llama-7B on my setup (8 Nvidia RTX A6000s, each with 48 GB of GPU memory). What would you recommend changing in the optimization config to get training working in this case? Thank you so much!
Ablustrund commented
Thank you very much for your interest in this project, and I apologize for the delayed reply.
We use ZeRO-3 and offload the parameters to the CPU; the batch size (bsz) is set to 2, and this consumes around 54 GB.
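
For reference, here is a minimal sketch of a DeepSpeed ZeRO-3 config with CPU offloading and a per-GPU micro-batch size of 2, matching the settings described above. This is an illustrative example, not the repo's actual config file, and the exact keys/values used by the authors may differ; note also that the reported ~54 GB exceeds the A6000's 48 GB, so further adjustments may still be needed on that hardware.

```python
# Hypothetical DeepSpeed ZeRO-3 config (as a Python dict); the repo's own
# ds_config may differ. Shards params/grads/optimizer states across GPUs
# and offloads parameters and optimizer states to CPU memory.
ds_config = {
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 3,  # ZeRO stage 3
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "train_micro_batch_size_per_gpu": 2,  # the bsz = 2 mentioned above
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "train_batch_size": "auto",
}
```

You can save this dict as a JSON file and pass it via `deepspeed --num_gpus=8 train.py --deepspeed ds_config.json`, or hand the dict directly to Hugging Face's `TrainingArguments(deepspeed=ds_config)` if the training script uses the HF Trainer.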