Safe Deep Reinforcement Learning for Multi-Agent Systems with Continuous Action Spaces
This repository contains the source code and implementation details for the paper "Safe Deep Reinforcement Learning for Multi-Agent Systems with Continuous Action Spaces", accepted at RL4RealLife @ ICML 2021.
Description
The goal of this approach is to develop a safe variant of Multi-Agent Deep Deterministic Policy Gradient (MADDPG). More specifically, each potentially unsafe action proposed by a MADDPG-based agent is projected onto a safe subspace using a QP solver. More details can be found in [1]. This project relies heavily on OpenAI's Multi-Agent Particle Environments [2], the simulator used to train and evaluate the agents.
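To make the projection step concrete, here is a minimal sketch of the special case with a single linearized constraint, c_now + g . a <= c_max, as modeled in [1]. In that case the QP has a closed-form solution, so no solver is needed. This is an illustration only, not the repository's implementation, which uses a QP solver. The function name `project_action` and its parameters are hypothetical.

```python
from typing import List, Sequence


def project_action(action: Sequence[float],
                   grad: Sequence[float],
                   c_now: float,
                   c_max: float) -> List[float]:
    """Project `action` onto the half-space {a : c_now + grad . a <= c_max}.

    Closed-form solution of
        min_a  0.5 * ||a - action||^2   s.t.   c_now + grad . a <= c_max
    for one linear constraint. `grad` plays the role of the learned
    constraint sensitivity (an assumption made for this sketch).
    """
    g_dot_a = sum(g * a for g, a in zip(grad, action))
    g_dot_g = sum(g * g for g in grad)
    if g_dot_g == 0.0:
        # The action has no influence on the constraint; nothing to correct.
        return list(action)
    # Lagrange multiplier: positive only if the constraint would be violated.
    lam = max(0.0, (g_dot_a + c_now - c_max) / g_dot_g)
    return [a - lam * g for a, g in zip(action, grad)]


if __name__ == "__main__":
    # Unsafe action: pushed back onto the constraint boundary.
    print(project_action([1.0, 0.0], [1.0, 0.0], c_now=0.5, c_max=1.0))
    # Already safe action: returned unchanged.
    print(project_action([0.2, 0.3], [1.0, 0.0], c_now=0.5, c_max=1.0))
```

With several constraints active at once, the closed form no longer applies in general, which is one reason to fall back on a general QP solver.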
Installation
To install and run the project's source code, follow the steps below. WARNING: this installs packages on your machine.
git clone git@github.com:zisikons/deep-rl.git
cd ./deep-rl
sh install_requirements.sh # installs all the requirements
Alternatively, if you don't want to install new packages on your computer, make sure to at least:
- Download the code of the forked multiagent-particle-envs submodule:
git submodule update --init --recursive
- Install the following Python packages:
gym==0.10.5
pyglet==1.3.2
qpsolvers
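If you install the packages manually, a quick check along the following lines can confirm that the pinned versions are present. This is only a sketch using the standard-library `importlib.metadata` module (Python 3.8+); the helper name `check_requirements` is hypothetical and not part of the repository.

```python
from importlib import metadata

# Pins copied from the list above; None means "any version".
REQUIRED = {"gym": "0.10.5", "pyglet": "1.3.2", "qpsolvers": None}


def check_requirements(required):
    """Return a {package: status} report for the given version pins."""
    report = {}
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if wanted is None or found == wanted:
            report[name] = "ok"
        else:
            report[name] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_requirements(REQUIRED).items():
        print(f"{name}: {status}")
```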
Execution
Once the code is downloaded and the dependencies are installed, train and test an agent as follows:
python3 scripts/collect_data.py # Uses the simulator to generate the datasets for the
# constraint sensitivity Neural Networks
python3 scripts/train_constraint_networks.py # Trains the constraint sensitivity Neural Networks
# (not required for the vanilla MADDPG agent)
python3 scripts/train_<agent_type>.py # Trains one of the 3 RL agents that were developed
python3 scripts/test_<agent_type>.py # Tests one of the 3 RL agents that were developed
Note: The above sequence takes a considerable amount of time.
Approach Summary
References
[1] Gal Dalal, Krishnamurthy Dvijotham, Matej Vecerik, Todd Hester, Cosmin Paduraru, Yuval Tassa (2018). Safe Exploration in Continuous Action Spaces.
[2] Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch (2017). Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments.
Contributors
- Athina Nisioti (@anisioti)
- Dimitris Gkouletsos (@dgkoul)
- Konstantinos Zisis (@zisikons)
- Ziyad Sheebaelhamd (@ziyadsheeba)