pysc2-agent

This project is a re-implementation of the paper "StarCraft II: A New Challenge for Reinforcement Learning". The learning algorithm is A2C, and both the Atari-net and FullyConv architectures are implemented. In addition, Atari-net can be trained in an auto-regressive manner.
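
As a rough illustration of what "auto-regressive" means for the action policy, the sketch below samples the action function first and then samples an argument from a distribution conditioned on that function. It is a minimal sketch in plain Python, not this repository's code: the names `sample_categorical`, `sample_autoregressive`, and the probability tables are hypothetical, and a real agent would produce these distributions with a neural network.

```python
import random


def sample_categorical(probs, rng):
    """Draw one index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_autoregressive(fn_probs, arg_probs_given_fn, rng):
    """Sample (function id, argument) using the chain rule.

    fn_probs: probabilities over action functions (hypothetical values).
    arg_probs_given_fn: one argument distribution per function id, so the
        argument choice depends on the function that was sampled first.
    """
    fn_id = sample_categorical(fn_probs, rng)
    arg = sample_categorical(arg_probs_given_fn[fn_id], rng)
    return fn_id, arg


if __name__ == "__main__":
    rng = random.Random(0)
    fn_probs = [0.3, 0.7]
    arg_probs_given_fn = [[0.9, 0.1], [0.2, 0.8]]
    print(sample_autoregressive(fn_probs, arg_probs_given_fn, rng))
```

Without the auto-regressive structure, the argument would be drawn from a single distribution that does not depend on the sampled function id.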

Requirements

Python 3 or above
pysc2 2.0.1
tensorflow or tensorflow-gpu

TODOs

Load trained models
Build FullyConv in auto-regressive manner
Save videos of evaluation episodes (new feature of pysc2 2.0)

Run an agent

Run an agent in the MoveToBeacon mini-game with a learning rate of 1e-4:

python3 run_sc2.py --map MoveToBeacon --ent_coef 1e-3 --lr 1e-4 --num_timesteps 600000 --num_cpu 8 --vl_coef 1.0 --max_grad_norm 0.5 --network atari --ar

Arguments

--map: The map you want to train on.
--ent_coef: Entropy coefficient.
--vl_coef: Weight of value loss.
--lr: Learning rate.
--optimizer: Optimizer for updating the weights. Available options: rmsprop and adam
--num_timesteps: Total training steps.
--num_cpu: Number of environments to run simultaneously.
--max_grad_norm: Max gradient norm, for gradient clipping.
--network: Type of network. The available options are: atari and fullyconv
--ar: Boolean flag. When set, the network is built in an auto-regressive manner. (Only available with Atari-net.)
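
To make the roles of `--ent_coef`, `--vl_coef`, and `--max_grad_norm` concrete, here is a minimal sketch of how such coefficients are commonly combined in an A2C objective and how global-norm gradient clipping works. It is written in plain Python for readability and is an assumption about the usual formulation, not a copy of this repository's TensorFlow code; the function names `a2c_loss` and `clip_by_global_norm` are hypothetical, and the exact value-loss scaling used here may differ.

```python
import math


def a2c_loss(log_probs, advantages, values, returns, entropies,
             ent_coef=1e-3, vl_coef=1.0):
    """Combine policy loss, value loss, and entropy bonus into one scalar.

    All inputs are equal-length lists of per-step floats (assumed layout).
    """
    n = len(log_probs)
    # Policy gradient term: raise log-probability of actions with positive advantage.
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    # Value term: mean squared error between predicted values and returns.
    value_loss = sum((r - v) ** 2 for r, v in zip(returns, values)) / n
    # Entropy term: subtracted so that higher entropy lowers the loss.
    entropy = sum(entropies) / n
    return policy_loss + vl_coef * value_loss - ent_coef * entropy


def clip_by_global_norm(grads, max_grad_norm=0.5):
    """Rescale all gradients so their joint L2 norm is at most max_grad_norm."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = min(1.0, max_grad_norm / (global_norm + 1e-8))
    return [[g * scale for g in grad] for grad in grads], global_norm


if __name__ == "__main__":
    loss = a2c_loss([-1.0, -1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0], [0.5, 0.5])
    clipped, norm = clip_by_global_norm([[3.0, 4.0]])
    print(loss, clipped, norm)
```

A larger `--ent_coef` encourages exploration by rewarding higher-entropy policies, while a larger `--vl_coef` gives the value-function error more weight relative to the policy term.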

Performance

Best mean scores: