Jiayu Chen, Yuanxin Zhang, Yuanfan Xu, Huimin Ma, Huazhong Yang, Jiaming Song, Yu Wang, Yi Wu.
Website: https://sites.google.com/view/vacl-neurips-2021
This repository implements Variational Automatic Curriculum Learning (VACL), a curriculum learning algorithm for solving challenging goal-conditioned cooperative multi-agent reinforcement learning problems. The implementation in this repository is used in the paper "Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems" (https://arxiv.org/abs/2111.04613). This repository is heavily based on https://github.com/marlbenchmark/on-policy.git.
Tested on CUDA == 10.0
git clone https://github.com/jiayu-ch15/Variational-Automatic-Curriculum-Learning.git
cd Variational-Automatic-Curriculum-Learning
conda create -n VACL python==3.6.2
conda activate VACL
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
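After installation, it can help to confirm that the packages are importable inside the VACL environment. The following is a minimal sketch and not part of this repository: the helper name check_py_module is hypothetical, and it assumes that `python` on your PATH is the interpreter of the active conda environment.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): report whether a Python
# module can be imported by the `python` interpreter found on PATH.
check_py_module() {
    module="$1"
    if python -c "import ${module}" >/dev/null 2>&1; then
        echo "${module}: OK"
    else
        echo "${module}: MISSING"
        return 1
    fi
}
```

For example, running `check_py_module torch` after `conda activate VACL` reports whether PyTorch is importable. To also see whether a GPU is visible, `python -c "import torch; print(torch.__version__, torch.cuda.is_available())"` prints the installed version and the CUDA availability flag.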
conda activate VACL
cd scripts
# run mpe without entity progression
sh train_mpe_woEP.sh
# run mpe with entity progression
sh train_mpe_EP.sh
Cooperative scenarios:
- simple_spread
- push_ball
- hard_spread
- Obtain a 30-day free trial on the MuJoCo website, or a free license if you are a student.
- Download the MuJoCo version 2.0 binaries for Linux.
- Unzip the downloaded mujoco200_linux.zip into ~/.mujoco/mujoco200, and place your license key at ~/.mujoco/mjkey.txt.
- Add the following lines to your .bashrc, then source your .bashrc:
export LD_LIBRARY_PATH=~/.mujoco/mujoco200/bin${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export MUJOCO_KEY_PATH=~/.mujoco${MUJOCO_KEY_PATH}
- Install mujoco-py by running:
pip install mujoco-py==2.0.2.13
If you encounter errors, refer to the official mujoco-py repository for help. Installing the following system libraries may resolve them:
sudo apt-get install libgl1-mesa-dev libosmesa6-dev
- To install mujoco-worldgen, follow these steps:
# install mujoco-worldgen
cd envs/hns/mujoco-worldgen/
pip install -e .
pip install xmltodict
# if you encounter an enum error, uninstall enum34
pip uninstall enum34
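The MuJoCo steps above place files at fixed paths, so a quick check of the layout can catch a misplaced archive or license key before mujoco-py is built. This is a minimal sketch under stated assumptions: the helper name check_mujoco_layout is hypothetical and not part of this repository, and it only checks for the bin directory referenced by LD_LIBRARY_PATH and for mjkey.txt.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): verify that the MuJoCo
# files sit where the installation steps place them.
# Optional argument: the MuJoCo root directory (defaults to ~/.mujoco).
check_mujoco_layout() {
    root="${1:-$HOME/.mujoco}"
    status=0
    # The bin directory is the one added to LD_LIBRARY_PATH.
    if [ ! -d "$root/mujoco200/bin" ]; then
        echo "missing directory: $root/mujoco200/bin"
        status=1
    fi
    # The license key is expected directly under the root.
    if [ ! -f "$root/mjkey.txt" ]; then
        echo "missing license key: $root/mjkey.txt"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "MuJoCo layout looks complete under $root"
    fi
    return "$status"
}
```

Calling `check_mujoco_layout` with no argument inspects ~/.mujoco; passing a directory checks that location instead. The function returns a non-zero status when something is missing, so it can be used in a larger setup script.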
conda activate VACL
cd scripts
# box locking task
sh train_bl.sh
# hide and seek task
sh train_hns.sh
If you find this repository useful, please cite our paper:
@misc{chen2021variational,
title={Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems},
author={Jiayu Chen and Yuanxin Zhang and Yuanfan Xu and Huimin Ma and Huazhong Yang and Jiaming Song and Yu Wang and Yi Wu},
year={2021},
eprint={2111.04613},
archivePrefix={arXiv},
primaryClass={cs.LG}}