Here DeepMind's SC2 environment is simplified and converted into an OpenAI Gym environment, so that existing Atari code can be applied to the simplified SC2 mini-games.
The FullyConv policy (smaller version) from https://deepmind.com/documents/110/sc2le.pdf is implemented and plugged into OpenAI Baselines' A2C implementation.
With this, the three easiest mini-games can be "solved" quickly.
See also: https://github.com/islamelnabarawy/sc2agents and https://github.com/islamelnabarawy/sc2gym for similar projects (the work here was done independently but later).
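For a rough idea of the network, below is a sketch of what a smaller FullyConv-style policy can look like in TensorFlow 1.x: convolutions with "same" padding that keep the screen resolution, a 1x1 convolution producing spatial policy logits (one per screen pixel), and a fully connected value head. The function name `fully_conv`, the layer sizes and the single-screen-input setup are illustrative assumptions and may differ from the actual policy in this repo and in the paper.

```python
import tensorflow as tf

def fully_conv(screen, n_pixels):
    """Illustrative FullyConv-style network (not the repo's exact policy).

    screen:   [batch, size, size, channels] float32 tensor with static shape
    n_pixels: size * size, the number of possible spatial actions
    """
    # Convolutions with 'same' padding keep the spatial resolution intact.
    conv1 = tf.layers.conv2d(screen, 16, 5, padding="same",
                             activation=tf.nn.relu, name="conv1")
    conv2 = tf.layers.conv2d(conv1, 32, 3, padding="same",
                             activation=tf.nn.relu, name="conv2")
    # Spatial policy head: a 1x1 conv gives one logit per screen pixel.
    spatial_logits = tf.layers.conv2d(conv2, 1, 1, name="spatial_logits")
    policy_logits = tf.reshape(spatial_logits, [-1, n_pixels])
    # Value head: fully connected layer on the flattened conv features.
    flat = tf.contrib.layers.flatten(conv2)
    fc = tf.layers.dense(flat, 256, activation=tf.nn.relu, name="fc")
    value = tf.squeeze(tf.layers.dense(fc, 1, name="value"), axis=-1)
    return policy_logits, value
```

Outputs of this shape (spatial policy logits plus a value estimate) are roughly what the Baselines A2C implementation consumes in place of its default CNN policy.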
| Map | Episodes | Avg score | Max score | DeepMind avg | DeepMind max |
| --- | --- | --- | --- | --- | --- |
| MoveToBeacon | 32*200 | 25 | 30 | 26 | 45 |
| CollectMineralShards** | 32*5000 | 73 | 100 | 103 | 134 |
| DefeatRoaches** | 48*4000 | 46 | 260 | 100 | 355 |
**CollectMineralShards and DefeatRoaches performance was still improving slightly when training was stopped.
- Avg and max are from the last n_envs*100 episodes.
- All maps used the parameters seen in the repo, except n_envs=32 (48 for DefeatRoaches).
- Episodes is the total number of episodes played across all environments.
DeepMind scores are shown for comparison; they are the FullyConv results reported in the release paper.
Install the requirements (Baselines etc.) listed below, clone the repo and run:
`python run_sc2_a2c.py --map_name MoveToBeacon --n_envs 32`
This won't save any files. Some results are printed to stdout.
- Python 3 (will NOT work with Python 2)
- OpenAI's Baselines (tested with 0.1.4). You can also skip installing it and drop the baselines folder inside this repo; most of Baselines' dependencies are not really needed if you only use A2C.
- pysc2 (tested with v1.2)
- TensorFlow (tested with 1.3.0)
- Other standard Python packages such as NumPy etc.
Here we use only the player_relative screen observation from the original observation space. The action space is limited to a single action: select army followed by attack-move (the same as the author does when he plays SC2).
With this slice of the observation/action space the agent can learn the 3 mini-games mentioned above; however, it is not enough for anything more complicated. A sketch of how such a restriction could look is shown below.
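As an illustration only, the sketch below shows how this slice could be exposed through Gym's interface: the player_relative screen layer as the observation, and a single Discrete action that picks a screen pixel, translated into "select army" at the start of the episode and an attack-move to that pixel on every step. The class name `SC2GymWrapper` and the exact pysc2 calls are assumptions based on the pysc2 1.x API, not necessarily how this repo's wrapper is written.

```python
import gym
import numpy as np
from gym import spaces
from pysc2.lib import actions, features

# Index of the player_relative layer among the screen features (pysc2 1.x).
_PLAYER_RELATIVE = features.SCREEN_FEATURES.player_relative.index

class SC2GymWrapper(gym.Env):  # hypothetical name, sketch only
    """Expose a pysc2 mini-game through the Gym interface."""

    def __init__(self, sc2_env, screen_size=32):
        # sc2_env is assumed to be a pre-built pysc2 SC2Env whose screen
        # resolution matches screen_size.
        self.sc2_env = sc2_env
        self.screen_size = screen_size
        self.observation_space = spaces.Box(
            low=0, high=4, shape=(screen_size, screen_size), dtype=np.int32)
        # One discrete action per screen pixel = attack-move target.
        self.action_space = spaces.Discrete(screen_size * screen_size)

    def _obs(self, timestep):
        # pysc2 1.x observations: dict with a stacked "screen" array.
        return timestep.observation["screen"][_PLAYER_RELATIVE]

    def reset(self):
        timestep = self.sc2_env.reset()[0]
        # Select the whole army once at the start of the episode.
        select = actions.FunctionCall(actions.FUNCTIONS.select_army.id, [[0]])
        timestep = self.sc2_env.step([select])[0]
        return self._obs(timestep)

    def step(self, action):
        # Decode the flat action index into (x, y) screen coordinates and
        # issue an unqueued attack-move to that point.
        y, x = divmod(int(action), self.screen_size)
        attack = actions.FunctionCall(
            actions.FUNCTIONS.Attack_screen.id, [[0], [x, y]])
        timestep = self.sc2_env.step([attack])[0]
        return self._obs(timestep), timestep.reward, timestep.last(), {}
```

Collapsing everything into a single Discrete action over screen pixels is what lets existing Gym/Atari-style agents run on the mini-games unchanged.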
The action/obs-space limitation makes the problem much easier, faster and less general/interesting. Because of this and the differences in the network and hyperparameters, the scores are not directly comparable with the release paper.
The scores achieved here are considerably lower than the DeepMind results, which suggests that the limited action space is not enough to achieve optimal performance (e.g. micro against Roaches, or using the two marines separately in CollectMineralShards).