[Question] Help to understand PettingZoo + SuperSuit + StableBaselines3 approach
Hi everyone,
I have successfully trained a simple multi-agent game environment using Stable Baselines 3 + PettingZoo + SuperSuit. Surprisingly, all of the agents learn incredibly well, even though Stable Baselines 3 only exposes a single-agent interface.
Now, my question is: I don't really understand how to classify this approach. Is it an example of "joint action learning" or "centralized training and decentralized execution"?
I have been following this tutorial, which is also available in the PettingZoo examples: https://towardsdatascience.com/multi-agent-deep-reinforcement-learning-in-15-lines-of-code-using-pettingzoo-e0b963c0820b
Unfortunately, SuperSuit doesn't seem to provide a detailed explanation of its workflow. It looks like observations and chosen actions are stacked together, so I tend to think it's a joint action learning implementation.
Thank you in advance!
That Towards Data Science article is out of date; the current SB3 tutorials are basically updated versions of it. I'll ask Jordan to update the post with links to the new tutorials.
As for the algorithm, I believe it's parameter sharing, since the same model is used for all agents. Centralized training usually means there is additional information about the global state available during training (link), whereas here each agent gets only its own observation and no additional information.
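For reference, the core of the setup in those tutorials is only a handful of lines. Here is a rough sketch based on the pistonball tutorial (wrapper versions and hyperparameters below are illustrative, not necessarily what you used):

```python
# Minimal sketch of the parameter-sharing setup, assuming the pistonball
# environment; wrapper versions and hyperparameters are illustrative only.
import supersuit as ss
from stable_baselines3 import PPO
from pettingzoo.butterfly import pistonball_v6

# Parallel PettingZoo environment: all agents step simultaneously.
env = pistonball_v6.parallel_env(n_pistons=20, continuous=True)

# Preprocess the visual observations so a single CNN policy can handle them.
env = ss.color_reduction_v0(env, mode="B")     # grayscale (single channel)
env = ss.resize_v1(env, x_size=84, y_size=84)  # downscale frames
env = ss.frame_stack_v1(env, 3)                # stack frames for temporal context

# Expose each agent as a separate index of a single-agent vector env.
env = ss.pettingzoo_env_to_vec_env_v1(env)
env = ss.concat_vec_envs_v1(env, 8, num_cpus=4, base_class="stable_baselines3")

# One PPO model trained on the experience of every agent: parameter sharing,
# since the policy only ever sees an individual agent's own observation.
model = PPO("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=2_000_000)
model.save("pistonball_shared_policy")
```

Every agent's transitions flow into the same PPO model, which is why I'd call it parameter sharing rather than centralized training.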
Glad to hear the tutorials worked well for you. I was honestly kind of blown away by how well it worked right out of the gate, without even any hyperparameter tuning. Very happy that we finally have training tutorials that just seem to work and aren't overly complicated like RLlib, or that require manually changing input dimensions and fiddling around like CleanRL, or Tianshou, which doesn't seem to work that well.
Thank you very much @elliottower for your answer and your information.
Thank you again. Surprisingly, I managed to use the PettingZoo parallel API to train drone swarms in a simulated environment. I have also extended the SuperSuit package to support multiple observations (in my case, an RGB camera plus position), and I will see if I can clean up the code and open a pull request.
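For context, the combined observation is roughly a Gymnasium Dict space; the names and shapes below are just illustrative, not the actual environment:

```python
# Illustrative sketch only: a combined observation space of the kind described
# above (RGB camera + position), expressed as a Gymnasium Dict space.
import numpy as np
from gymnasium import spaces

observation_space = spaces.Dict({
    # RGB camera frame from the drone (shape is just an example)
    "camera": spaces.Box(low=0, high=255, shape=(84, 84, 3), dtype=np.uint8),
    # 3D world-frame position of the drone
    "position": spaces.Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float32),
})
```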
Wow, very cool. If you want, you could add the environment to our third-party environments list and include a training script in your repo so people can use it as a reference. And a PR for SuperSuit sounds good.
Let me know if you have any more questions or need help with the PR you mentioned for SuperSuit. Feel free to reopen; just doing housekeeping.