alpa-projects/alpa

Why did you choose Ray instead of torch.distributed?

HongLouyemeng opened this issue · 2 comments

I'm curious why you are using Ray for the device mesh. QAQ

I have the same question. Maybe Ray is more flexible for the device mesh?

Ray is more flexible thanks to its actor model.
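To make the actor idea concrete, here is a minimal sketch in plain Python (standard library only, not Ray and not Alpa's code). Each worker keeps its own state between calls and handles messages one at a time, which is roughly what a per-host mesh worker needs. The names `MeshWorker`, `put_buffer`, `sum_buffer`, and `run_demo` are hypothetical and only for illustration; in Ray itself the same pattern is written with a `@ray.remote` class, `.remote()` calls, and `ray.get()`.

```python
import queue
import threading


class MeshWorker:
    """Hypothetical stateful actor: owns its state, handles one message at a time."""

    def __init__(self, host_id):
        self.host_id = host_id
        self.buffers = {}  # state that stays on the worker between calls
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Messages are processed sequentially, so the state needs no locks.
        while True:
            method, args, reply = self._inbox.get()
            if method is None:  # shutdown message
                break
            try:
                result = getattr(self, method)(*args)
            except Exception as exc:  # hand errors back instead of hanging the caller
                result = exc
            reply.put(result)

    def call(self, method, *args):
        """Send a message; the returned queue acts as a simple future."""
        reply = queue.Queue(maxsize=1)
        self._inbox.put((method, args, reply))
        return reply

    def stop(self):
        self._inbox.put((None, (), None))
        self._thread.join()

    # Handlers below run only on the actor's own thread.
    def put_buffer(self, name, values):
        self.buffers[name] = list(values)
        return len(values)

    def sum_buffer(self, name):
        return sum(self.buffers[name])


def run_demo():
    workers = [MeshWorker(i) for i in range(2)]
    # Each worker stores a different shard and remembers it.
    for w in workers:
        w.call("put_buffer", "x", [w.host_id, w.host_id + 1]).get()
    # Issue all calls first, then collect, so the workers run concurrently.
    pending = [w.call("sum_buffer", "x") for w in workers]
    results = [p.get() for p in pending]
    for w in workers:
        w.stop()
    return results


results = run_demo()
print(results)
```

The point of the sketch is that the caller addresses long-lived workers individually and asynchronously, rather than every process running the same program in lockstep.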