arxyzan/data2vec-pytorch

Any plan for supporting distributed training?

ruizewang opened this issue · 3 comments

Hi arxyzan~
Thanks for your great contribution! I was wondering whether there is any plan to support distributed training, since pre-training is too slow on a single GPU.

Hi Ray
I strongly recommend using the official repo if you want stable results for pretraining. But in case you want to use other encoder backends (which might be a little complicated to do in fairseq), just let me know and I'll put it on my to-do list.

Thanks for your quick response. Your implementation is neat and easy to follow, while the official one provided by fairseq is somewhat complicated.
Anyway, thanks for your contribution again.

Glad to hear that, as it's been my primary goal. I'm definitely going to provide code for distributed training and mixed precision. Stay tuned!
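In the meantime, for anyone landing here before that code exists, below is a minimal sketch of how DDP plus mixed precision is typically wired up in plain PyTorch. The model, dataset, and loss here are placeholders, not actual classes from data2vec-pytorch, and this is not the repo's planned implementation.

```python
# Minimal DDP + AMP sketch (placeholders only, not data2vec-pytorch code).
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset


def main():
    # torchrun sets LOCAL_RANK; one process per GPU.
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    # Placeholder model/data; swap in the data2vec model and your dataset.
    model = torch.nn.Linear(128, 128).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(1024, 128))
    sampler = DistributedSampler(dataset)  # shards the data across ranks
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    scaler = torch.cuda.amp.GradScaler()  # fp16 loss scaling

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for (x,) in loader:
            x = x.cuda(local_rank, non_blocking=True)
            optimizer.zero_grad(set_to_none=True)
            with torch.cuda.amp.autocast():
                out = model(x)
                loss = out.float().pow(2).mean()  # stand-in for the real loss
            scaler.scale(loss).backward()  # DDP all-reduces gradients here
            scaler.step(optimizer)
            scaler.update()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launched with something like `torchrun --nproc_per_node=NUM_GPUS train_ddp.py`, each process trains on its own GPU and gradients are averaged across ranks automatically by DDP.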