Reproduction of the self-play training described in the paper "Emergent Complexity via Multi-Agent Competition", adapted from the PPO2 implementation in OpenAI Baselines.
- Python 3.6
- gym==0.15.4
- mujoco-py==2.1.2.14
- tensorflow==1.15.5
- MuJoCo 2.1
- robosumo:
cd robosumo && pip install -e .
- baselines:
cd baselines && pip install -e .
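
After installing, it can help to confirm that the pinned versions above are the ones actually present in the environment. The following is a minimal sketch, not part of this repository: the names `PINNED`, `installed_version`, and `find_mismatches` are hypothetical helpers introduced here for illustration. It does not check MuJoCo itself, robosumo, or baselines, since no versions are pinned for those above.

```python
import sys

# Pins copied from the dependency list above.
PINNED = {
    "gym": "0.15.4",
    "mujoco-py": "2.1.2.14",
    "tensorflow": "1.15.5",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        # Available on Python 3.8 and later.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Python 3.6/3.7: fall back to pkg_resources from setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Return {name: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    print("python", "%d.%d" % sys.version_info[:2], "(expected 3.6)")
    for name, (expected, found) in find_mismatches(PINNED).items():
        print("%s: expected %s, found %s" % (name, expected, found))
```

Running the script prints the interpreter version followed by one line per mismatched or missing package; no package lines means all three pins are satisfied.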