
- Please pay attention to the version of SC2 you are using for your experiments. 
- Performance is *not* always comparable between versions. 
- The results in SC2BA (https://arxiv.org/abs/1902.04043) use SC2.4.6.2.69232, not SC2.4.10.

SC2BA

StarCraft+: Benchmarking Multi-Agent Algorithms in Adversary Paradigm

SC2BA is an environment for research on competitive multi-agent reinforcement learning (MARL) based on Blizzard's StarCraft II RTS game. SC2BA makes use of Blizzard's StarCraft II Machine Learning API and DeepMind's PySC2 to provide a convenient interface for autonomous agents to interact with the StarCraft II engine, obtaining observations and performing actions. Unlike PySC2, SC2BA concentrates on decentralised micromanagement scenarios like SMAC, where each unit of the game is controlled by an individual RL agent. However, the enemies in SMAC are controlled by the built-in AI with fixed strategies, which limits the difficulty and challenge of the environment and leads to insufficient diversity and generality in algorithm evaluation. SC2BA refreshes the benchmarking of MARL algorithms in an adversarial paradigm: both multi-agent teams are controlled by designated MARL algorithms in a continuous adversarial setting.

Grounded in SC2BA, we benchmark classic MARL algorithms in two types of adversarial modes: dual-algorithm paired adversary and multi-algorithm mixed adversary. The former pits pairs of algorithms against each other, while the latter focuses on the adversary against multiple behaviours from a group of algorithms. The extensive benchmark experiments reveal some thought-provoking observations and problems regarding the effectiveness, sensitivity and scalability of these algorithms.
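As an illustration only, the two modes roughly correspond to the following schedules; the algorithm pool and the run_match helper below are hypothetical placeholders, not part of SC2BA:

from itertools import combinations

# Hypothetical pool of MARL algorithms to benchmark; run_match stands in for
# "train/evaluate the red algorithm against the blue one in SC2BA".
ALGORITHMS = ["QMIX", "COMA", "QPLEX", "DOP"]

def run_match(red, blue):
    print(f"{red} (red) vs {blue} (blue)")

# Dual-algorithm paired adversary: every pair of algorithms plays head-to-head.
for red, blue in combinations(ALGORITHMS, 2):
    run_match(red, blue)

# Multi-algorithm mixed adversary: each algorithm faces the mixed behaviours
# of all remaining algorithms in the pool.
for red in ALGORITHMS:
    for blue in ALGORITHMS:
        if blue != red:
            run_match(red, blue)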

Please refer to the accompanying paper for the outline of our motivation for using SC2BA as a testbed for MARL research and the initial experimental results.

About

Together with SC2BA we also release APyMARL - our PyTorch framework for adversarial MARL research, which includes implementations of several state-of-the-art and classical algorithms such as DOP, QPLEX, QMIX and COMA.

Should you have any questions, please reach out to lizishu@njust.edu.cn or open an issue.

Quick Start

Installing SC2BA

You can install SC2BA by using the following command:

pip install git+https://github.com/dooliu/SC2BA.git

Alternatively, you can clone the SC2BA repository and then install sc2ba with its dependencies:

git clone https://github.com/dooliu/SC2BA.git
pip install -e SC2BA/

NOTE: If you want to extend SC2BA, please install the package as follows:

git clone https://github.com/dooliu/SC2BA.git
cd SC2BA
pip install -e ".[dev]"
pre-commit install

You may also need to upgrade pip: pip install --upgrade pip for the install to work.

Installing StarCraft II

SC2BA is based on the full game of StarCraft II (versions >= 3.16.1). To install the game, follow the instructions below.

Linux

Please use Blizzard's repository to download the Linux version of StarCraft II. By default, the game is expected to be in the ~/StarCraftII/ directory. This can be changed by setting the environment variable SC2PATH.

MacOS/Windows

Please install StarCraft II from Battle.net. The free Starter Edition also works. PySC2 will find the latest binary should you use the default install location. Otherwise, similar to the Linux version, you would need to set the SC2PATH environment variable with the correct location of the game.
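If the game lives somewhere else, SC2PATH can also be set from Python before the environment is created; a minimal sketch (the path below is only an example):

import os

# Point PySC2/SC2BA at a non-default StarCraft II installation; adjust the path
# to wherever the game is actually installed.
os.environ.setdefault("SC2PATH", os.path.expanduser("~/StarCraftII"))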

NOTE: Players in mainland China can no longer access the CN Battle.net servers. You can download StarCraft II by following this video and then configure Battle.net accordingly.

SC2BA maps

SC2BA is composed of many combat scenarios with pre-configured maps. Before SC2BA can be used, these maps need to be downloaded into the Maps directory of StarCraft II.

Download the SC2BA_Maps and put them in your $SC2PATH/Maps directory. If you installed SC2BA via git, simply copy the maps directory from smac/env/starcraft2/maps/ into the $SC2PATH/Maps directory.
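If you prefer to do the copy from Python, here is a minimal sketch; both paths are examples and should be adjusted to where the maps were extracted and where the game is installed:

import os
import shutil

# Copy the downloaded/extracted maps folder into the StarCraft II Maps directory.
# Adjust src to wherever the SC2BA maps actually live on your machine.
sc2_path = os.environ.get("SC2PATH", os.path.expanduser("~/StarCraftII"))
src = "SC2BA_Maps"                                  # extracted maps folder (example path)
dst = os.path.join(sc2_path, "Maps", "SC2BA_Maps")
shutil.copytree(src, dst, dirs_exist_ok=True)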

List the maps

To see the list of SC2BA maps, together with the number of ally and enemy units and episode limit, run:

python -m sc2ba.bin.map_list 

Creating new maps

We integrate all combat units into one map, allowing any battle scenario to be implemented within this unified map, whereas the previous paradigm employs an individual map file for each scene, which makes scene definition/modification tedious and error-prone.

The settings of battle forces, multi-agent attributes and scene elements are completely specified as text prompts, which greatly improves the operability of scene configuration and reduces the burden on algorithm developers.

So, a new combat scenario can be created simply by adding a text prompt to the sc2ba_maps file:

"2s3z": {
        "n_agents": 5,
        "n_enemies": 5,
        "limit": 120,
        "a_race": "P",
        "b_race": "P",
        "unit_type_bits": 10,
        "map_type": "stalkers_and_zealots",
        "red_units": {"zealot": 3, "stalker": 2},
        "blue_units": {"zealot": 3, "stalker": 2},
        "red_start_position": (9, 16),
        "blue_start_position": (23, 16),
        "blue_control_models": [0, 1, 2, 3, 4, 5, 6, 7, 8],
        "playable_area": {"lower_left": (0, 8), "upper_right": (32, 24)},
        "mirror_position": True,
        "map_name": "COMMON"
}

NOTE: SC2BA supports nine unit types: Marines, Medivacs, Marauders, Colossi, Stalkers, Zealots, Zerglings, Hydralisks and Banelings.
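Once the entry is added, the new scenario can be used like any built-in one. A minimal sketch, assuming the same imports as the random-agents example at the bottom of this page:

# Adjust the import path to match your installation; see the full example below.
from smacbattle.env import Agent, Camp, StarCraft2BAEnv, get_map_params

# Look up the freshly added scenario and build an agent-vs-agent environment.
map_params = get_map_params("2s3z")
players = [Agent(map_params["a_race"], Camp.RED.name),
           Agent(map_params["b_race"], Camp.BLUE.name)]
env = StarCraft2BAEnv(map_name="2s3z", players=players)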

Testing SC2BA

Please run the following command to make sure that sc2ba and its maps are properly installed.

python -m sc2ba.examples.random_agents

Saving and Watching StarCraft II Replays

Saving a replay

If you're using our APyMARL framework for multi-agent RL, here's what needs to be done:

  1. Saving models: We run experiments on Linux servers with save_model = True (save_model_interval is also relevant) so that training checkpoints (the parameters of the neural networks) are saved.
  2. Loading models: Learnt models can be loaded using the checkpoint_path parameter. If you run APyMARL on MacOS (or Windows) while also setting save_replay=True, this will save a .SC2Replay file for test_nepisode episodes in test mode (no exploration) in the Replay directory of StarCraft II.

If you want to save replays without using APyMARL, simply call the save_replay() function of the SC2BA environment in your training/testing code. This will save a replay of all episodes since the launch of the StarCraft II client.
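For example, a minimal sketch, where env is an already constructed SC2BA environment (as in the code example below) and run_one_episode stands in for your own rollout loop:

# Play a few test episodes, then dump a .SC2Replay file covering everything
# since the StarCraft II client was launched.
for _ in range(10):
    run_one_episode(env)  # placeholder for your own rollout/evaluation loop
env.save_replay()
env.close()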

The easiest way to save and later watch a replay on Linux is to use Wine.

Watching a replay

You can watch the saved replay directly within the StarCraft II client on MacOS/Windows by clicking on the corresponding Replay file.

Documentation

For the detailed description of the environment, read the SC2BA documentation.

The initial results of our experiments using SC2BA can be found in the accompanying paper.

Code Examples

Below is a small code example which illustrates how SC2BA can be used. Here, individual agents execute random policies after receiving the observations and global state from the environment.

If you want to try state-of-the-art algorithms (such as QMIX and COMA) on SC2BA, make use of APyMARL - our framework for adversarial MARL research.

import time

import numpy as np

# NOTE: the exact import paths depend on your SC2BA installation; besides the
# environment class, this example assumes that Camp, Agent, Bot, difficulties
# and get_map_params are exposed by the same package.
from smacbattle.env import (Agent, Bot, Camp, StarCraft2BAEnv, difficulties,
                            get_map_params)


def main():
    map_name = "3m"
    players = []
    map_params = get_map_params(map_name)
    # if a player is controlled by the built-in AI, use the Bot class; otherwise use Agent
    players.append(Agent(map_params["a_race"], Camp.RED.name))
    agent2 = "Agent"
    if agent2 == "Bot":
        players.append(Bot(map_params["b_race"], Camp.BLUE.name + "(Computer)", difficulties["7"]))
    else:
        players.append(Agent(map_params["b_race"], Camp.BLUE.name))
    print(f"players value:{players}")
    env = StarCraft2BAEnv(map_name=map_name, players=players)
    env_info = env.get_env_info()
    print(f"Envs states is:{env_info}")
    # n_actions = env_info["n_actions"]
    n_agents = env_info["n_agents"]
    n_enemies = env_info["n_enemies"]

    n_episodes = 10

    for e in range(n_episodes):
        env.reset()
        terminated = False
        episode_reward = 0

        while not terminated:
            red_state = env.get_state(camp=Camp.RED)
            red_obs = env.get_obs(camp=Camp.RED)
            # print(f"Red Camp get state:{red_state}")
            # print(f"Red Camp get obs:{red_obs}")
            blue_state = env.get_state(camp=Camp.BLUE)
            blue_obs = env.get_obs(camp=Camp.BLUE)
            # print(f"Blue Camp get state:{blue_state}")
            # print(f"Blue Camp get obs:{blue_obs}")
            actions = []
            for agent_id in range(n_agents):
                avail_actions = env.get_avail_agent_actions(agent_id)
                avail_actions_ind = np.nonzero(avail_actions)[0]
                action = np.random.choice(avail_actions_ind)
                actions.append(action)
            for enemy_id in range(n_enemies):
                avail_actions = env.get_avail_agent_actions(enemy_id, camp=Camp.BLUE)
                avail_actions_ind = np.nonzero(avail_actions)[0]
                action = np.random.choice(avail_actions_ind)
                actions.append(action)
            print(f"random_agent, 生成的actions:{actions}")
            reward, terminated, _ = env.step(actions)
            print(f"当前回合奖励值为:{reward}")
            episode_reward += reward[1]
            time.sleep(0.1)

        print("Total reward in episode {} = {}".format(e, episode_reward))

    env.close()


if __name__ == "__main__":
    main()