How is multi-agent handled?
Jogima-cyber opened this issue · 8 comments
Hello there, I was wondering how you handle multi-agent learning. Let's take the supported Kaggle hungry-geese environment as an example. There is a function in the environment class: `rule_based_action()`. If it exists, does it mean that the trained policy represents one player and the other three are rule-based via this function?
After some research in the library's code, it seems to me that self-play (the same net, meaning the same policy for all four agents in the hungry-geese game, for example) is the default behaviour of HandyRL. I didn't see any use of the `rule_based_action` function.
Hi, thank you for your interest.
You are right about self-play. `rule_based_action()` is used by the `RuleBasedAgent` class in `evaluation.py`. This agent is only used for evaluating trained models (e.g. in the evaluation run during `python main.py --train`, or with `python main.py --eval`).
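For intuition, here is a minimal sketch of how such a wrapper could look. Only the names `rule_based_action()`, `RuleBasedAgent`, and `evaluation.py` come from the discussion above; the class bodies below are assumptions for illustration, not HandyRL's actual implementation.

```python
# Hypothetical sketch (not HandyRL's actual code): an environment exposing
# rule_based_action() and an evaluation-time agent that simply delegates to it.

import random


class Environment:
    """Toy environment with a hand-written fallback policy."""

    def legal_actions(self, player):
        # e.g. NORTH, SOUTH, WEST, EAST in a hungry-geese-like game
        return [0, 1, 2, 3]

    def rule_based_action(self, player):
        # Any heuristic would go here; this one just picks a random legal move.
        return random.choice(self.legal_actions(player))


class RuleBasedAgent:
    """Evaluation-only opponent that delegates to env.rule_based_action()."""

    def action(self, env, player):
        return env.rule_based_action(player)


env = Environment()
print(RuleBasedAgent().action(env, player=0))  # e.g. 2
```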
However, `RuleBasedAgent` is not set as the default opponent in the current configuration. If you want to evaluate the win rate against the rule-based agent during training, you need to change this line. And change this line if you want to change the opponent in model evaluation (`python main.py --eval`).
And if you want to train against agents other than the current policy, you can set specific opponents for self-play by rewriting `generation.py` (`worker.py`); see the sketch below. (ref. gfootball example)
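As an illustration of what such a rewrite could do, here is a hedged sketch of assigning agents to player seats when generating an episode. The function `assign_agents`, the agent objects, and the sampling scheme are hypothetical, not HandyRL code:

```python
# Hypothetical sketch: mix self-play seats with fixed-opponent seats when
# generating an episode. None of these names come from HandyRL itself.

import random


def assign_agents(players, trained_agent, opponent_agents, opponent_ratio=0.3):
    """Return a mapping from player id to the agent controlling that seat.

    One seat always belongs to the learning policy; each remaining seat is
    either the same policy (self-play) or a fixed opponent, chosen with
    probability `opponent_ratio`.
    """
    learner = random.choice(players)
    assignment = {}
    for p in players:
        if p == learner or random.random() >= opponent_ratio:
            assignment[p] = trained_agent                    # self-play seat
        else:
            assignment[p] = random.choice(opponent_agents)   # fixed opponent
    return assignment


# Example for a 4-player game (e.g. hungry-geese) with one rule-based opponent:
# assign_agents([0, 1, 2, 3], trained_agent=policy, opponent_agents=[rule_agent])
```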
Multi-agent learning and self-play against specific opponents are not yet supported as default functionality in HandyRL. Multi-agent learning also has to take GPU handling into account... further consideration is required.
Thank you, very clear answers. It's a very powerful library because of its scalability and the fact that it goes straight to the point. Thank you also for making this library public; it is my favorite of all the ones I know, and the code is very pleasant to read. I'm going to try it in the Kaggle Halite playground competition, since you have kind of already given a very good model for hungry geese ahaha. But I'll maybe need to make some adjustments to HandyRL to support action embedding, as the action space in Halite is very large and complex.
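(For anyone reading later, here is one rough sketch of what I mean by action embedding: score each candidate action by the dot product between the state representation and a learned embedding of that action. This is a standard PyTorch sketch under my own assumptions, not something HandyRL provides.)

```python
# Hypothetical sketch: scoring a large/structured action space via learned
# action embeddings instead of one output unit per action.

import torch
import torch.nn as nn


class ActionEmbeddingPolicy(nn.Module):
    def __init__(self, state_dim, num_actions, embed_dim=64):
        super().__init__()
        self.state_encoder = nn.Sequential(
            nn.Linear(state_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        self.action_embed = nn.Embedding(num_actions, embed_dim)

    def forward(self, state, candidate_actions):
        # state: (batch, state_dim); candidate_actions: (batch, k) action ids
        h = self.state_encoder(state)               # (batch, embed_dim)
        a = self.action_embed(candidate_actions)    # (batch, k, embed_dim)
        return torch.einsum('bd,bkd->bk', h, a)     # (batch, k) logits


policy = ActionEmbeddingPolicy(state_dim=32, num_actions=1000)
state = torch.randn(2, 32)
candidates = torch.randint(0, 1000, (2, 5))
print(policy(state, candidates).shape)  # torch.Size([2, 5])
```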
@Jogima-cyber
Thank you so much for your kind reply. We made HandyRL public because there were not yet enough libraries that are easy to use and customize for real applications, even though other libraries have many rich features. We keep the code in HandyRL minimal and simple; minimal code lets users understand and use it easily. But because of that, a user who wants to extend its functionality needs to implement additional code, as you do (this tradeoff is a difficult problem for us...).
I'm very much looking forward to your trial with Halite. When you get some results, please tell us! Practical use cases are very useful for everyone.
Thank you!
In my experience, RL libraries with really complex features don't work very well at the moment anyway.
I have another question, but I don't want to spam your issue tracker, so I'll ask it here: does HandyRL handle one-player games?
Yes. You can easily apply it to one-player games, though several config parameters are not relevant to them.
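For example, a toy one-player environment could look roughly like the sketch below. The method names are my assumptions based on the environments bundled with the library, so please check `handyrl/environment.py` for the actual base class and interface.

```python
# Rough sketch of a one-player environment in a HandyRL-like style.
# Method names here are assumptions, not the library's confirmed interface.

class CountUpEnv:
    """Toy single-player game: pick +1 or +2 until the total reaches 10."""

    def reset(self, args=None):
        self.total = 0

    def play(self, action, player=0):
        self.total += action + 1   # action 0 -> +1, action 1 -> +2

    def turn(self):
        return 0                   # only one player, so it is always their turn

    def players(self):
        return [0]                 # single player: opponent-related settings unused

    def terminal(self):
        return self.total >= 10

    def outcome(self):
        return {0: 1.0}            # final reward for the single player

    def legal_actions(self, player=0):
        return [0, 1]
```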