/pytorch-mace

A pytorch implementation of Structured Exploration via Deep Hierarchical Coordination

Primary LanguagePython

An implementation of MACE

1. Introduction

This is a pytorch implementation of Structured Exploration via Deep Hierarchical Coordination.

The experimental environment is a modified version of Waterworld named MAWaterWorld_modified_mixed based on MADRL.

2. Environment

The main features (different from MADRL and that of MADDPG) of the modified Waterworld environment are:

  • evaders and poisons now bounce at the wall obeying physical rules
  • sizes of the evaders, pursuers and poisons are now the same so that random actions will lead to average rewards around 0.
  • need exactly n_coop agents to catch food.
  • discrete actions: up, down, left, right.

3. Dependency

  • pytorch
  • visdom
  • python==3.6.1 (recommend using the anaconda/miniconda)

4. Install

  • Install MADRL.
  • Replace the files in madrl_environments/pursuit directory with the ones in this repo.
  • python main.py will run the training.

5. Results

Two agents, cooperation = 2, evader = 3, poison = 3

The two agents need to cooperate to achieve the food for reward 10.

PNG/233-8tv1.png