This is the code for the MADDPG-based algorithms (Bi-AC, MADDPG) presented in the paper Bi-level Actor-Critic for Multi-agent Coordination.
It is based on the multi-agent reinforcement learning framework malib.
It is configured to run with a slightly modified environment derived from highway-env.
To install, follow the same installation procedure as malib.
Main dependencies: Python (3.6), OpenAI gym (0.14.0), tensorflow (2.0.0), numpy (1.17.0), matplotlib, pickle.
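If you prefer a manual setup, the pinned versions can be installed with pip. This is only a sketch; malib's own installation steps take precedence, and pickle already ships with the Python standard library:

pip install gym==0.14.0 tensorflow==2.0.0 numpy==1.17.0 matplotlib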
This is the directory for the matrix game setting shown in the paper. To run the experiment in this directory, run:
cd bilevel_pg/experiments
python run_trainer.py
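After training finishes, you may want to plot the learning curve. The following is a minimal sketch that assumes the trainer pickles a list of per-episode returns; the file name rewards.pkl is hypothetical, so substitute whatever path your run actually writes:

import pickle
import matplotlib.pyplot as plt

# Hypothetical output file; replace with the path produced by run_trainer.py.
with open('rewards.pkl', 'rb') as f:
    returns = pickle.load(f)  # assumed: a list of per-episode returns

plt.plot(returns)
plt.xlabel('episode')
plt.ylabel('return')
plt.title('Matrix game training curve')
plt.show()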
This is the directory for the highway-env setting shown in the paper. To run the experiment in this directory, run:
cd bilevel_pg_highway_1x1/bilevel_pg
This takes you into the directory where all the training code is located; you may run any of the algorithms provided. For example, to run Bi-AC:
python run_trainer_highway.py
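For intuition, Bi-AC treats the two agents as a Stackelberg (leader-follower) pair: the leader commits to an action first, the follower best-responds to that action, and the leader optimizes in anticipation of the follower's response. Below is a minimal NumPy sketch of this bi-level action selection; the Q arrays are hypothetical stand-ins for trained critics, not the repository's actual API:

import numpy as np

def select_actions(q_leader, q_follower):
    # Both arrays are indexed as [leader action, follower action].
    follower_br = np.argmax(q_follower, axis=1)  # follower's best response to each leader action
    # The leader evaluates each of its actions under the anticipated response.
    leader_values = q_leader[np.arange(q_leader.shape[0]), follower_br]
    a_leader = int(np.argmax(leader_values))
    return a_leader, int(follower_br[a_leader])

q = np.array([[3.0, 1.0],
              [0.0, 2.0]])
print(select_actions(q, q))  # shared critic -> coordinated choice (0, 0)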
This is the directory where we test the Bi-Q method without a neural network. To run the experiment for Bi-Q, run:
cd bully_q
python bilevelq_vs_table_q.py
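As a rough illustration of tabular Bi-Q on a repeated matrix game, here is a self-contained sketch; the 2x2 shared-payoff matrix and hyperparameters are made up for illustration and do not mirror bilevelq_vs_table_q.py:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared payoff: coordinating on action (0, 0) is best.
R = np.array([[1.0, 0.0],
              [0.0, 0.5]])

Q1 = np.zeros((2, 2))  # leader Q-table over (leader action, follower action)
Q2 = np.zeros((2, 2))  # follower Q-table over (leader action, follower action)
alpha, eps = 0.1, 0.1

for _ in range(5000):
    if rng.random() < eps:  # epsilon-greedy exploration over the joint action
        a1, a2 = int(rng.integers(2)), int(rng.integers(2))
    else:
        br = np.argmax(Q2, axis=1)                 # follower best response per leader action
        a1 = int(np.argmax(Q1[np.arange(2), br]))  # leader anticipates the response
        a2 = int(br[a1])
    r = R[a1, a2]
    # Stateless (bandit-style) updates; the full method handles sequential settings.
    Q1[a1, a2] += alpha * (r - Q1[a1, a2])
    Q2[a1, a2] += alpha * (r - Q2[a1, a2])

br = np.argmax(Q2, axis=1)
a1 = int(np.argmax(Q1[np.arange(2), br]))
print('greedy joint action:', (a1, int(br[a1])))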
If you use this code in your experiments or find it helpful, please consider citing the following paper:
@article{zhang2019bi,
  title={Bi-level Actor-Critic for Multi-agent Coordination},
  author={Zhang, Haifeng and Chen, Weizhe and Huang, Zeren and Li, Minne and Yang, Yaodong and Zhang, Weinan and Wang, Jun},
  journal={arXiv preprint arXiv:1909.03510},
  year={2019}
}