This project is a re-implementation of the paper "StarCraft II: A New Challenge for Reinforcement Learning". The learning algorithm is A2C, and both the Atari-net and FullyConv architectures are implemented. In addition, the Atari-net can be trained in an auto-regressive manner.
- python 3 or above
- pysc2 2.0.1
- tensorflow or tensorflow-gpu
- Load trained models
- Build FullyConv in auto-regressive manner
- Save videos of evaluation episodes (new feature of pysc2 2.0)
Run an agent on the MoveToBeacon mini-game with a learning rate of 1e-4.
python3 run_sc2.py --map MoveToBeacon --ent_coef 1e-3 --lr 1e-4 --num_timesteps 600000 --num_cpu 8 --vl_coef 1.0 --max_grad_norm 0.5 --network atari --ar
- `--map`: The map to train on.
- `--ent_coef`: Entropy coefficient.
- `--vl_coef`: Weight of the value loss.
- `--lr`: Learning rate.
- `--optimizer`: Optimizer used to update the weights. Available options: `rmsprop` and `adam`.
- `--num_timesteps`: Total number of training steps.
- `--num_cpu`: Number of environments to run simultaneously.
- `--max_grad_norm`: Maximum gradient norm, used for gradient clipping.
- `--network`: Type of network. Available options: `atari` and `fullyconv`.
- `--ar`: Boolean flag. Whether to build the network in an auto-regressive manner (only available with Atari-net).
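
To make the roles of `--ent_coef`, `--vl_coef`, and `--max_grad_norm` concrete, here is a minimal sketch of how such coefficients are commonly used in an A2C update. It assumes the usual A2C formulation (policy loss plus weighted value loss minus weighted entropy) and clipping by the global L2 norm; the repository's actual loss code is not shown in this README, so the function names `a2c_loss` and `clip_by_global_norm` are hypothetical and the details may differ.

```python
import math


def a2c_loss(policy_loss, value_loss, entropy, vl_coef=1.0, ent_coef=1e-3):
    """Combine the three A2C terms into one scalar to minimize.

    Assumption: standard A2C weighting, not necessarily this project's code.
    """
    # Entropy is subtracted, so a more exploratory policy lowers the loss.
    return policy_loss + vl_coef * value_loss - ent_coef * entropy


def clip_by_global_norm(grads, max_grad_norm=0.5):
    """Rescale gradients so their joint L2 norm is at most max_grad_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    global_norm = math.sqrt(sum(g * g for g in grads))
    if global_norm <= max_grad_norm:
        return list(grads), global_norm
    scale = max_grad_norm / global_norm
    return [g * scale for g in grads], global_norm


if __name__ == "__main__":
    # Coefficients match the example command: --vl_coef 1.0 --ent_coef 1e-3
    loss = a2c_loss(policy_loss=0.2, value_loss=0.5, entropy=1.5)
    # Toy gradient vector; --max_grad_norm 0.5 as in the example command
    clipped, norm = clip_by_global_norm([3.0, 4.0], max_grad_norm=0.5)
    print(loss, clipped, norm)
```

With a larger `--ent_coef` the entropy bonus weighs more heavily against the policy loss, which generally encourages exploration; a smaller `--max_grad_norm` limits the size of each update more strictly.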
Best mean scores:
Map | Ours | DeepMind |
---|---|---|
MoveToBeacon | 25 | 26 |
DefeatRoaches | 91 | 101 |
CollectMineralShards | 104 |