/alpha-terminal

Strong Amateur Play in Terminal Game with Deep Policy Gradient

Primary LanguagePython

Alpha-Terminal

A self-contained framework to train and deploy Reinforcement Learning (Policy Gradient) agents to play the game of Terminal, developed on top of the starter kit.

Read our report of the final performance of the agent we trained (on a single CPU) that achieved strong amateur performance.

demo

About the game: Terminal

Terminal is a two-player zero-sum tower-defense game played on a 28-by-28 board. Each player begins the game with 30 Health Points. The goal is to reduce the opponent’s Health Points to 0 by strategically sending mobile units to achieve touchdowns on the opponent’s edges. In the meanwhile, the player needs to build defense structures to protect her own edge from the opponent’s touchdown.

Training

Our architecture is inspired by the LSTM policy of AlphaStar, which enables the agent to output multiple actions per turn. The agent is trained using vanilla Policy Gradient.

To train the LSTM agent, run

python-algo/run_and_learn.sh

Deployment

To run the agent locally, run

python-algo/run.sh

To see the agent play live, upload the python-algo folder to the Terminal website.