MAMBA

This code accompanies the paper "Scalable Multi-Agent Model-Based Reinforcement Learning".

The repository contains MAMBA implementation as well as fine-tuned hyperparameters in configs/dreamer/optimal folder.

Installation

python3.7 is required

pip install wheel
pip install flatland-2.2.2/
pip install -r requirements.txt

Installing Starcraft:

https://github.com/oxwhirl/smac#installing-starcraft-ii

Usage

python3 train.py --n_workers 2 --env flatland --env_type 5_agents

Two environments are supported for env flag: flatland and starcraft.

Optimal parameters

To train agents with optimal parameters from the paper they should be copied from configs/dreamer/optimal/ folder to DreamerAgentConfig.py and DreamerLearnerConfig.py

SMAC

The code for the environment can be found at https://github.com/oxwhirl/smac

Flatland

The original code for the environment can be found at https://github.com/jbr-ai-labs/NeurIPS2020-Flatland-Competition-Solution

Code Structure

agent contains implementation of MAMBA
- controllers contains logic for inference
- learners contains logic for learning the agent
- memory contains buffer implementation
- models contains architecture of MAMBA
- optim contains logic for optimizing loss functions
- runners contains logic for running multiple workers
- utils contains helper functions
- workers contains logic for interacting with environment
env contains environment logic
networks contains neural network architectures

Citation

@inproceedings{10.5555/3535850.3535894,
      author = {Egorov, Vladimir and Shpilman, Alexei},
      title = {Scalable Multi-Agent Model-Based Reinforcement Learning},
      year = {2022},
      isbn = {9781450392136},
      publisher = {International Foundation for Autonomous Agents and Multiagent Systems},
      address = {Richland, SC},
      booktitle = {Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems},
      pages = {381–390},
      numpages = {10},
      keywords = {communication, multi-agent reinforcement learning, model-based reinforcement learning},
      location = {Virtual Event, New Zealand},
      series = {AAMAS '22}
}

jbr-ai-labs/mamba