MATE: the Multi-Agent Tracking Environment

This repo contains the source code of MATE, the Multi-Agent Tracking Environment. The full documentation can be found at https://mate-gym.readthedocs.io. The full list of implemented agents can be found in the Implemented Algorithms section. For a detailed description, please check out our paper (PDF, BibTeX).

This is an asymmetric two-team zero-sum stochastic game with partial observations, in which each team has multiple agents (multiplayer). Intra-team communications are allowed, but inter-team communications are prohibited. The game is cooperative among teammates but competitive between the two teams (opponents).

Installation

git config --global core.symlinks true  # required on Windows
pip3 install git+https://github.com/XuehaiPan/mate.git#egg=mate

NOTE: Python 3.7+ is required; Python versions lower than 3.7 are not supported.

It is highly recommended to create a new isolated virtual environment for MATE using conda:

git clone https://github.com/XuehaiPan/mate.git && cd mate
conda env create --no-default-packages --file conda-recipes/basic.yaml  # or full-cpu.yaml to install RLlib
conda activate mate

Getting Started

Make the MultiAgentTracking environment and play!

import mate

# Base environment for MultiAgentTracking
env = mate.make('MultiAgentTracking-v0')
env.seed(0)
done = False
camera_joint_observation, target_joint_observation = env.reset()
while not done:
    camera_joint_action, target_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    (
        (camera_joint_observation, target_joint_observation),
        (camera_team_reward, target_team_reward),
        done,
        (camera_infos, target_infos)
    ) = env.step((camera_joint_action, target_joint_action))
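
For a quick sanity check, the same loop can also track the per-team returns and render each frame (a minimal sketch; the render() call assumes the standard Gym rendering interface):

import mate

env = mate.make('MultiAgentTracking-v0')
env.seed(0)
camera_joint_observation, target_joint_observation = env.reset()
camera_return, target_return = 0.0, 0.0
done = False
while not done:
    camera_joint_action, target_joint_action = env.action_space.sample()  # random joint actions
    (
        (camera_joint_observation, target_joint_observation),
        (camera_team_reward, target_team_reward),
        done,
        (camera_infos, target_infos)
    ) = env.step((camera_joint_action, target_joint_action))
    camera_return += camera_team_reward
    target_return += target_team_reward
    env.render()  # assumes the standard Gym render() call is supported

print(f'camera team return: {camera_return:.2f}, target team return: {target_return:.2f}')

Since the base game is zero-sum between the two teams, the two returns should be opposite in sign.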

Another example with a built-in single-team wrapper (see also Built-in Wrappers):

import mate

env = mate.make('MultiAgentTracking-v0')
env = mate.MultiTarget(env, camera_agent=mate.GreedyCameraAgent(seed=0))
env.seed(0)
done = False
target_joint_observation = env.reset()
while not done:
    target_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    target_joint_observation, target_team_reward, done, target_infos = env.step(target_joint_action)
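
The camera team can be controlled symmetrically with the MultiCamera wrapper, using a built-in greedy opponent for the targets (a sketch mirroring the example above; the target_agent argument follows the usage shown in the wrapper example further below):

import mate

env = mate.make('MultiAgentTracking-v0')
env = mate.MultiCamera(env, target_agent=mate.GreedyTargetAgent(seed=0))
env.seed(0)
done = False
camera_joint_observation = env.reset()
while not done:
    camera_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    camera_joint_observation, camera_team_reward, done, camera_infos = env.step(camera_joint_action)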

Screencast
4 Cameras vs. 8 Targets (9 Obstacles)

Examples and Demos

mate/evaluate.py contains the example evaluation code for the MultiAgentTracking environment. Try out the following demos:

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 2 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v2-9.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-9.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(8 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-8v8-9.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 0 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-0.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(0 cameras, 8 targets, 32 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-Navigation.yaml
Demo screencasts: 4 cameras vs. 2 targets (9 obstacles); 4 cameras vs. 8 targets (9 obstacles); 8 cameras vs. 8 targets (9 obstacles); 4 cameras vs. 8 targets (no obstacles); 8 targets navigation (no cameras).

You can specify the agent classes and arguments by:

python3 -m mate.evaluate --camera-agent module:class --camera-kwargs <JSON-STRING> --target-agent module:class --target-kwargs <JSON-STRING>

You can find the example agent code in the examples directory. The full list of implemented agents can be found in the Implemented Algorithms section. For example:

# Example demos in examples
python3 -m examples.naive

# Use the evaluation script
python3 -m mate.evaluate --episodes 1 --render-communication \
    --camera-agent examples.greedy:GreedyCameraAgent --camera-kwargs '{"memory_period": 20}' \
    --target-agent examples.greedy:GreedyTargetAgent \
    --config MATE-4v8-9.yaml \
    --seed 0

Communication

You can implement your own custom agent classes to play around with. See Make Your Own Agents for more details.
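
As a rough starting point, a custom target agent can mirror the pattern used by the built-in example agents (a hypothetical sketch: the base-class name mate.TargetAgentBase, the act() signature, and the action_space attribute are assumptions here, so check Make Your Own Agents for the authoritative interface):

import mate

class RandomTargetAgent(mate.TargetAgentBase):  # assumed base class name
    def act(self, observation, info=None, deterministic=None):  # assumed signature
        # Replace this with your own policy; here we simply sample a random action.
        return self.action_space.sample()  # assumed attribute

Such an agent could then be passed to the evaluation script via --target-agent your_module:RandomTargetAgent (the module name here is hypothetical).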

Environment Configurations

The MultiAgentTracking environment accepts a Python dictionary mapping or a configuration file in JSON or YAML format. If you want to use customized environment configurations, you can copy the default configuration file:

cp "$(python3 -m mate.assets)"/MATE-4v8-9.yaml MyEnvCfg.yaml

Then modify it to suit your needs. Create the environment with the modified configuration by:

env = mate.make('MultiAgentTracking-v0', config='/path/to/your/cfg/file')
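
Since the environment also accepts a Python dictionary, you can load a preset, tweak it in code, and pass the dictionary directly (a minimal sketch; the exact keys depend on the configuration file, so only the loading pattern is shown):

import mate
import yaml

with open('MyEnvCfg.yaml') as file:
    config = yaml.safe_load(file)  # parses to a plain dictionary

# Modify fields of `config` here as needed before creating the environment.
env = mate.make('MultiAgentTracking-v0', config=config)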

There are several preset configuration files in the mate/assets directory.

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 2 targets, 9 obstacles)
env = mate.make('MATE-4v2-9-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-4v8-9-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(8 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-8v8-9-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 0 obstacles)
env = mate.make('MATE-4v8-0-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(0 cameras, 8 targets, 32 obstacles)
env = mate.make('MATE-Navigation-v0')

You can reinitialize the environment with a new configuration without creating a new instance:

>>> env = mate.make('MultiAgentTracking-v0', wrappers=[mate.MoreTrainingInformation])  # we support wrappers
>>> print(env)
<MoreTrainingInformation<MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)>

>>> env.load_config('MATE-8v8-9.yaml')
>>> print(env)
<MoreTrainingInformation<MultiAgentTracking<MultiAgentTracking-v0>>(8 cameras, 8 targets, 9 obstacles)>

In addition, we provide a script mate/assets/generator.py to generate a configuration file with a reasonable camera placement:

python3 -m mate.assets.generator --path 24v48.yaml --num-cameras 24 --num-targets 48 --num-obstacles 20
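
The generated file can then be used like any other configuration (a minimal usage sketch):

env = mate.make('MultiAgentTracking-v0', config='24v48.yaml')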

See Environment Customization for more details.

Built-in Wrappers

MATE provides multiple wrappers for different settings, such as full observability, discrete action spaces, and single-team multi-agent control. See Built-in Wrappers for more details.

Observation wrappers:
  EnhancedObservation: Enhance the agent's observation by setting all observation masks to True.
  SharedFieldOfView: Share the field of view among agents in the same team by applying the logical OR operator over the observation masks. The target agents also share the empty status of warehouses.
  MoreTrainingInformation: Add more environment and agent information to the info field of step(), enabling full observability of the environment.
  RescaledObservation: Rescale all entity states in the observation to [-1, +1].
  RelativeCoordinates: Convert all locations of other entities in the observation to relative coordinates.

Action wrappers:
  DiscreteCamera: Allow cameras to use discrete actions.
  DiscreteTarget: Allow targets to use discrete actions.

Reward wrappers:
  AuxiliaryCameraRewards: Add additional auxiliary rewards for each individual camera.
  AuxiliaryTargetRewards: Add additional auxiliary rewards for each individual target.

Single-team wrappers:
  MultiCamera / MultiTarget: Wrap into a single-team multi-agent environment.
  SingleCamera / SingleTarget: Wrap into a single-team single-agent environment.

Communication wrappers:
  MessageFilter: Filter messages from agents in intra-team communications.
  RandomMessageDropout: Randomly drop messages in communication channels.
  RestrictedCommunicationRange: Add a restricted communication range to communication channels.
  NoCommunication: Disable intra-team communications, i.e., filter out all messages.
  ExtraCommunicationDelays: Add extra message delays to communication channels.

Miscellaneous wrappers:
  RepeatedRewardIndividualDone: Repeat the reward field and assign individual done fields in step(), similar to MPE.

You can create an environment with multiple wrappers at once. For example:

env = mate.make('MultiAgentTracking-v0',
                wrappers=[
                    mate.EnhancedObservation,
                    mate.MoreTrainingInformation,
                    mate.WrapperSpec(mate.DiscreteCamera, levels=5),
                    mate.WrapperSpec(mate.MultiCamera, target_agent=mate.GreedyTargetAgent(seed=0)),
                    mate.RepeatedRewardIndividualDone,
                    mate.WrapperSpec(mate.AuxiliaryCameraRewards,
                                     coefficients={'raw_reward': 1.0,
                                                   'coverage_rate': 1.0,
                                                   'soft_coverage_score': 1.0,
                                                   'baseline': -2.0}),
                ])
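
The resulting environment is a single-team (camera-side) environment with discrete actions, and RepeatedRewardIndividualDone turns the reward and done fields into per-agent values (a rough interaction sketch based on the wrapper descriptions above; the exact shapes may differ):

env.seed(0)
camera_joint_observation = env.reset()
done = False
while not done:
    camera_joint_action = env.action_space.sample()  # discrete joint action after DiscreteCamera
    camera_joint_observation, camera_rewards, camera_dones, camera_infos = env.step(camera_joint_action)
    done = all(camera_dones)  # per-agent done flags after RepeatedRewardIndividualDone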

Implemented Algorithms

The following algorithms are implemented in examples:

NOTE: all learning-based algorithms are tested with Ray 1.12.0 on Ubuntu 20.04 LTS.

Citation

If you find MATE useful, please consider citing:

@inproceedings{pan2022mate,
  title     = {{MATE}: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control},
  author    = {Xuehai Pan and Mickel Liu and Fangwei Zhong and Yaodong Yang and Song-Chun Zhu and Yizhou Wang},
  booktitle = {Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year      = {2022},
  url       = {https://openreview.net/forum?id=SyoUVEyzJbE}
}

License

MIT License