
StarCraft

This is a PyTorch implementation of the multi-agent reinforcement learning algorithms QMIX, VDN, and COMA, which are state-of-the-art MARL algorithms. We trained these algorithms on SMAC (the StarCraft Multi-Agent Challenge), a set of decentralised micromanagement scenarios for StarCraft II.
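As a rough illustration of the value-factorisation idea behind the Q-learning variants (this is a sketch, not the code in this repository): VDN represents the joint action-value as the sum of per-agent Q-values, while QMIX mixes them through a state-conditioned network whose weights are forced non-negative, so the joint value stays monotonic in each agent's value.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VDNMixer(nn.Module):
    """VDN: the joint Q-value is the sum of per-agent Q-values."""
    def forward(self, agent_qs, state=None):
        # agent_qs: (batch, n_agents) -> (batch, 1)
        return agent_qs.sum(dim=1, keepdim=True)

class QMixer(nn.Module):
    """QMIX: state-conditioned mixing with non-negative weights,
    so Q_tot is monotonic in every agent's Q-value."""
    def __init__(self, n_agents, state_dim, embed_dim=32):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        # Hypernetworks produce the mixing weights from the global state.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1))

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        bs = agent_qs.size(0)
        w1 = torch.abs(self.hyper_w1(state)).view(bs, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(bs, 1, self.embed_dim)
        hidden = F.elu(torch.bmm(agent_qs.view(bs, 1, self.n_agents), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(bs, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(bs, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(bs, 1)  # (batch, 1)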

Corresponding Papers

1. "QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning"
2. "Value-Decomposition Networks For Cooperative Multi-Agent Learning"
3. "Counterfactual Multi-Agent Policy Gradients"

Requirements

Acknowledgement

Quick Start

$ python main.py --evaluate_epoch=100 --map=3m --alg=qmix

Run main.py directly, and the algorithm will be evaluated on the '3m' map for 100 episodes using the pretrained model.
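For reference, a stand-alone evaluation loop over SMAC looks roughly like the sketch below; main.py wraps the same idea behind the --map and --evaluate_epoch arguments. A random valid action is chosen in place of a trained policy, purely to keep the sketch self-contained.

import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")
n_agents = env.get_env_info()["n_agents"]

wins, n_episodes = 0, 100
for _ in range(n_episodes):
    env.reset()
    terminated, info = False, {}
    while not terminated:
        # A trained agent network would pick actions here; we sample a
        # random action from each agent's available actions instead.
        actions = []
        for agent_id in range(n_agents):
            avail = env.get_avail_agent_actions(agent_id)
            actions.append(np.random.choice(np.nonzero(avail)[0]))
        reward, terminated, info = env.step(actions)
    wins += int(info.get("battle_won", False))

print(f"win rate over {n_episodes} episodes: {wins / n_episodes:.2f}")
env.close()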

Results

Although QMIX, VDN, and COMA are state-of-the-art multi-agent algorithms, they can be unstable from run to run. To reproduce the results reported in the papers, run the training independently several times (more than 10) and take the median or mean of the results.
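Aggregating the runs takes only a few lines of NumPy; the file names below are hypothetical, not the repository's actual output paths.

import numpy as np

# One win-rate curve per independent run, e.g. win_rates_run0.npy,
# each an array of shape (n_eval_points,).
curves = np.stack([np.load(f"win_rates_run{i}.npy") for i in range(10)])
median_curve = np.median(curves, axis=0)  # robust to outlier runs
mean_curve = curves.mean(axis=0)
print("final median win rate:", median_curve[-1])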

1. Win Rate of QMIX in Two Independent Runs on '3m'

2. Win Rate of VDN in Two Independent Runs on '3m'

3. Win Rate of COMA in One Run on '3m'