thu-ml/tianshou

How to run RL on multiple nodes in a cluster

How do I use RayVecEnv in a cluster? I want to run my RL code with multi-node training. I'm new to Ray. Are there any demo scripts?

Hi @HYB777. This is a Ray configuration question: as long as you set up Ray on a multi-node cluster, call ray.init so that it connects to that cluster, and use RayVecEnv, things should work out.
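For reference, a minimal sketch of that setup might look like the following. It assumes a Ray cluster is already running (started with `ray start --head` on the head node and `ray start --address=<head-ip>:6379` on each worker node), that `gymnasium` is the environment library in use, and that `CartPole-v1` stands in for your own task. The helper names `make_env`, `make_env_fns`, and `main` are hypothetical, and this has not been verified on a multi-node cluster.

```python
from functools import partial


def make_env(task: str):
    """Build a single environment instance (hypothetical helper)."""
    # Assumption: gymnasium is installed; older tianshou versions use gym instead.
    import gymnasium as gym

    return gym.make(task)


def make_env_fns(task: str, num_envs: int):
    """Return one zero-argument factory per environment, as RayVecEnv expects."""
    return [partial(make_env, task) for _ in range(num_envs)]


def main() -> None:
    # Third-party imports are kept local so the helpers above have no hard
    # dependency on ray or tianshou.
    import ray
    from tianshou.env import RayVecEnv

    # Connect to the already-running cluster instead of starting a local one.
    ray.init(address="auto")

    # Each environment is wrapped in a Ray actor that Ray schedules on the cluster.
    envs = RayVecEnv(make_env_fns("CartPole-v1", 8))
    print(f"created {len(envs)} environments")

    envs.reset()
    envs.close()
    ray.shutdown()
```

Calling `main()` from a driver script on the head node would then create the vectorized environment. From there, `envs` can be passed to a collector in the same way as any other tianshou vector environment.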

That being said, I haven't tested personally on a multi-node cluster yet.

Since we're not Ray developers, I think this question is outside the scope of support from the tianshou team. However, if you encounter tianshou-specific issues on the cluster, feel free to let us know!

Ray has a large community and a lot of documentation, so I suggest you start there. If you want to contribute a multi-node running example, I'm happy to review a PR.