A self-contained framework to train and deploy Reinforcement Learning (Policy Gradient) agents to play the game of Terminal, developed on top of the starter kit.
Read our report of the final performance of the agent we trained (on a single CPU) that achieved strong amateur performance.
Terminal is a two-player zero-sum tower-defense game played on a 28-by-28 board. Each player begins the game with 30 Health Points. The goal is to reduce the opponent’s Health Points to 0 by strategically sending mobile units to achieve touchdowns on the opponent’s edges. In the meanwhile, the player needs to build defense structures to protect her own edge from the opponent’s touchdown.
Our architecture is inspired by the LSTM policy of AlphaStar, which enables the agent to output multiple actions per turn. The agent is trained using vanilla Policy Gradient.
To train the LSTM agent, run
python-algo/run_and_learn.sh
To run the agent locally, run
python-algo/run.sh
To see the agent play live, upload the python-algo
folder to the Terminal website.