This GIT contains the implementations of the ICML 2020 paper "Deep Coordination Graphs" by Wendelin Böhmer, Vitaly Kurin and Shimon Whiteson. The code is based on the python/pytorch framework PyMARL from the Starcraft Multi-Agent Challenge. For questions and comments please contact Wendelin Böhmer.
Follow the first 2 steps from PyMARL :
- Build the Dockerfile using
cd docker
bash build.sh
- Set up StarCraft II and SMAC:
bash install_sc2.sh
This will download SC2 into the 3rdparty folder and copy the maps necessary to run over.
Finally, copy the StarCraft II maps used in the DCG paper:
cp -f maps/* ./3rdparty/StarCraftII/Maps/Melee
Both algorithms and environments can be used with existing versions of PyMARL as well.
The Dockerfile must install the torch_scatter
library, which can lead to conflicts with the installed torch
version and the Ubuntu version of the base image. See docker/Dockerfile
.
The environments can be imported by copying:
src/config/envs/rel_overgen.yaml
src/config/envs/ghost_hunt.yaml
src/envs/stag_hunt.py
and by appending src/envs/__init__.py
with:
from .stag_hunt import StagHunt
REGISTRY["stag_hunt"] = partial(env_fn, env=StagHunt)
The algorithms can be imported by copying:
src/config/algs/cg.yaml
src/config/algs/dcg.yaml
src/config/algs/dcg_noshare.yaml
src/config/algs/lrq.yaml
src/controllers/cg_controller.py
src/controllers/dcg_controller.py
src/controllers/dcg_noshare_controller.py
src/controllers/low_rank_controller.py
src/learners/dcg_learner.py
src/modules/agents/rnn_feature_agent.py
by appending src/controllers/__init__.py
with:
from .dcg_controller import DeepCoordinationGraphMAC
REGISTRY["dcg_mac"] = DeepCoordinationGraphMAC
from .dcg_noshare_controller import DCGnoshareMAC
REGISTRY["dcg_noshare_mac"] = DCGnoshareMAC
from .cg_controller import SimpleCoordionationGraphMAC
REGISTRY["cg_mac"] = SimpleCoordionationGraphMAC
from .low_rank_controller import LowRankMAC
REGISTRY["low_rank_q"] = LowRankMAC
by appending src/learners/__init__.py
with:
from .dcg_learner import DCGLearner
REGISTRY["dcg_learner"] = DCGLearner
and finally by appending src/modules/agents/__init__.py
with:
from .rnn_feature_agent import RNNFeatureAgent
REGISTRY["rnn_feat"] = RNNFeatureAgent
As in the PyMARL framework, all experiments are run like this:
python3 src/main.py --config=$ALG --env-config=$ENV with $PARAMS
The experiments should be run in a Docker container, to avoid installing complicated dependencies:
bash run.sh $GPU python3 src/main.py --config=$ALG --env-config=$ENV with $PARAMS
The sacred
logs containing the results will be stored as json
files in the results
folder.
Algorithms | $ALG |
$PARAMS |
Comment |
---|---|---|---|
DCG | dcg |
DCG w/o low-rank approximation | |
DCG-S | dcg |
duelling=True |
DCG with privileged bias function |
DCG (rank $K ) |
dcg |
cg_payoff_rank=$K |
DCG with low-rank approximation |
DCG (nps) | dcg_noshare |
DCG w/o parameter sharing | |
CG | cg |
DCG (nps) with central observations | |
QTRAN | qtran |
from PyMARL | |
QMIX | qmix |
from PyMARL | |
IQL | iql |
from PyMARL | |
VDN (pymarl) | vdn |
from PyMARL | |
VDN (dcg) | dcg |
cg_edges=vdn |
VDN using DCG classes |
VDN-S | dcg |
cg_edges=vdn duelling=True |
VDN with privileged bias function |
LRQ (rank $K ) |
lrq |
low_rank=$K |
Low-rank joint Q-value |
All DCG variants and CG can have the following topologies:
Topologies | $ALG |
$PARAMS |
---|---|---|
DCG | dcg /dcg_noshare /cg |
|
CYCLE | dcg /dcg_noshare /cg |
cg_edges=cycle |
LINE | dcg /dcg_noshare /cg |
cg_edges=line |
STAR | dcg /dcg_noshare /cg |
cg_edges=star |
VDN | dcg /dcg_noshare /cg |
cg_edges=vdn |
$N rand. edges |
dcg /dcg_noshare /cg |
cg_edges=$N |
given topology | dcg /dcg_noshare /cg |
cg_edges=$LIST |
$LIST
must be a list of tuples of node indices (starting with 0), for example, $LIST="[(0,1),(1,2),(2,3),(3,0)]"
for a cycle of 4 agents.
Experiment | $ENV |
$PARAMS |
Task |
---|---|---|---|
Fig. 2a | rel_overgen |
env_args.miscapture_punishment=0 |
Relative overgeneralization |
Fig. 2b | rel_overgen |
env_args.miscapture_punishment=-1 |
Relative overgeneralization |
Fig. 2c | rel_overgen |
env_args.miscapture_punishment=-1.25 |
Relative overgeneralization |
Fig. 2d | rel_overgen |
env_args.miscapture_punishment=-1.5 |
Relative overgeneralization |
Fig. 3a-c | rel_overgen |
env_args.miscapture_punishment=-2 |
Relative overgeneralization |
Fig. 4a-c | ghost_hunt |
Artificial decentralization | |
Fig. 5, 8a | sc2 |
env_args.map_name=MMM2 |
StarCraft II |
Fig. 8b | sc2 |
env_args.map_name=so_many_baneling |
StarCraft II |
Fig. 8c | sc2 |
env_args.map_name=8m_vs_9m |
StarCraft II |
Fig. 8d | sc2 |
env_args.map_name=3s_vs_5z |
StarCraft II |
Fig. 8e | sc2 |
env_args.map_name=3s5z |
StarCraft II |
Fig. 8f | sc2 |
env_args.map_name=micro_focus |
StarCraft II |
If you use DCG in your research, or any other implementation provided here that is not included in PyMARL, please cite the DCG paper:
W. Böhmer, V. Kurin and S. Whiteson. Deep Coordination Graphs, to appear at the International Conference on Machine Learning (ICML), 2020. URL: https://arxiv.org/abs/1910.00091
In BibTeX format:
@InProceedings{boehmer2020dcg,
title = {Deep Coordination Graphs},
author = {Wendelion B\"ohmer and Vitaly Kurin and Shimon Whiteson},
booktitle = {International Conference on Machine Learning},
url = {https://arxiv.org/abs/1910.00091},
year = {2020},
}
As DCG uses PyMARL, you should also cite the SMAC paper:
M. Samvelyan, T. Rashid, C. Schroeder de Witt, G. Farquhar, N. Nardelli, T.G.J. Rudner, C.-M. Hung, P.H.S. Torr, J. Foerster, S. Whiteson. The StarCraft Multi-Agent Challenge, CoRR abs/1902.04043, 2019.
In BibTeX format:
@article{samvelyan19smac,
title = {{The} {StarCraft} {Multi}-{Agent} {Challenge}},
author = {Mikayel Samvelyan and Tabish Rashid and Christian Schroeder de Witt and Gregory Farquhar and Nantas Nardelli and Tim G. J. Rudner and Chia-Man Hung and Philiph H. S. Torr and Jakob Foerster and Shimon Whiteson},
journal = {CoRR},
volume = {abs/1902.04043},
year = {2019},
}
Code licensed under the Apache License v2.0