This is a pytorch implementation of the multi-agent reinforcement learning algorithms, QMIX, VDN and COMA, which are the state of art MARL algorithms. We trained these algorithms on SMAC, the decentralised micromanagement scenario of StarCraft II.
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- Value-Decomposition Networks For Cooperative Multi-Agent Learning
- Counterfactual Multi-Agent Policy Gradients
$ python main.py --evaluate_epoch=100 --map=3m --alg=qmix
Directly run the main.py, then the algorithm will be tested on map '3m' for 100 episodes, using the pretrained model.
Although QMIX, VDN and COMA are the state of art multi-agent algorithms, they are unstable sometimes. If you want the same results as in the papers, you need to independently run several times(more than 10) and take the median or mean of them.