Cluster-MAAC

Code for applying clustering to the MAAC actor-critic algorithm for MARL. Project for CS470 (KAIST, Spring '22).


C-MAAC: Cluster-based Multi-Actor Attention Critic for Reinforcement Learning

Presentation Video | Report

Project for KAIST CS470: Introduction to Artificial Intelligence (Spring '22) by Hyuncheol Park, Taeyeong Lee, Yuseung Lee and Jongjun Park. This work is an extension of the original MAAC (Iqbal and Sha, ICML 2019), and the baseline code for MAAC was forked from the official repository.

[Figure: pipeline]

Abstract

Multi-Actor-Attention-Critic (MAAC) proposed an efficient actor-critic algorithm for multi-agent environments by integrating it with the attention mechanism from [10]. In this paper, we test MAAC on two MPE environments, Simple Tag and Simple Spread, and then propose an extension of this method, C-MAAC, which introduces a new component, the Cluster Critic, into the original critic pipeline. The experimental results show that our method does not outperform the original MAAC, but we analyze possible ways to improve its performance. Code is available at https://github.com/CS470RL/maac-MPE.

Requirements

1. Install the packages in requirements.txt

pip install -r requirements.txt 

2. Install OpenAI Baselines

git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

3. Install Multi-Agent Particle Environment

git clone https://github.com/shariqiqbal2810/multiagent-particle-envs.git
cd multiagent-particle-envs
pip install -e .

How to Run

Warning! Running on an SSH server without a GUI

  • Prefix your commands with 'xvfb-run -a' so that they run inside a virtual display.
xvfb-run -a python run_simple_tag.py ex --clst_ratio 0.5 --use_gpu
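
The prefixing rule above can also be applied automatically. The following is a minimal sketch and not part of this repository: the helper name build_command and the choice to treat an empty DISPLAY environment variable as "no GUI" are assumptions made for illustration. It only builds and prints the command, so nothing is launched.

```python
import os
import shlex


def build_command(script, run_name, extra_args=(), headless=None):
    """Build the argv list for a run, adding 'xvfb-run -a' when headless.

    build_command is a hypothetical helper, not a function from this repo.
    If headless is None, it is guessed from the DISPLAY environment variable
    (assumption: an unset or empty DISPLAY means there is no GUI).
    """
    if headless is None:
        headless = not os.environ.get("DISPLAY")
    cmd = ["python", script, run_name, *extra_args]
    if headless:
        # Same prefix as in the README: run inside a virtual framebuffer.
        cmd = ["xvfb-run", "-a", *cmd]
    return cmd


if __name__ == "__main__":
    # Flags copied from the README example; their meaning is not restated here.
    cmd = build_command("run_simple_tag.py", "ex",
                        ["--clst_ratio", "0.5", "--use_gpu"])
    print(shlex.join(cmd))
```

To actually start the run, the returned list could be passed to subprocess.run(cmd, check=True).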

1. Run Simple Tag

python run_simple_tag.py test_1
  • Experiment 1. Good=10, Adversary=5, Landmarks=0
  • Experiment 2. Good=5, Adversary=10, Landmarks=0

2. Run Simple Adversary

python run_simple_adversary.py test_1
  • Experiment 1. Good=10, Adversary=5, Landmarks=3
  • Experiment 2. Good=5, Adversary=10, Landmarks=3