Using deep reinforcement learning to beat QWOP

QWOP

This project aims to use deep reinforcement learning to play the game QWOP.

It is the first in a series of collaborative projects between PTStephD and Kirkados.

The Algorithm

The core deep reinforcement learning algorithm is the Distributional Deep Q Learning algorithm, first presented by Bellemare et al. in 2017. A number of enhancements developed by other researchers are used as well. Namely:

Special thanks to:

for publishing their code! The open-source mindset of AI research is fantastic.
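As a rough illustration of the distributional idea named above, the sketch below shows the categorical projection step from Bellemare et al. (2017) for a single transition, in plain Python. It is not taken from this repository: the function names, the support bounds v_min and v_max, and the number of atoms are assumptions made for the example, and the actual implementation may differ.

```python
import math
from typing import List


def atoms(n: int, v_min: float, v_max: float) -> List[float]:
    """Evenly spaced return values (the fixed support of the distribution)."""
    dz = (v_max - v_min) / (n - 1)
    return [v_min + i * dz for i in range(n)]


def expected_value(probs: List[float], v_min: float, v_max: float) -> float:
    """Q-value recovered from a categorical return distribution."""
    return sum(p * z for p, z in zip(probs, atoms(len(probs), v_min, v_max)))


def project_target(next_probs: List[float], reward: float, gamma: float,
                   done: bool, v_min: float, v_max: float) -> List[float]:
    """Project the shifted distribution r + gamma * Z back onto the fixed support.

    Hypothetical helper for illustration; all parameter names are assumptions.
    """
    n = len(next_probs)
    dz = (v_max - v_min) / (n - 1)
    out = [0.0] * n
    for p, z in zip(next_probs, atoms(n, v_min, v_max)):
        # Bellman update of one atom, clipped to the support range.
        tz = reward if done else reward + gamma * z
        tz = min(max(tz, v_min), v_max)
        # Fractional index of the updated atom, clamped against rounding error.
        b = min(max((tz - v_min) / dz, 0.0), float(n - 1))
        lo, hi = math.floor(b), math.ceil(b)
        if lo == hi:
            out[lo] += p  # lands exactly on an atom
        else:
            out[lo] += p * (hi - b)  # split mass between the two neighbours
            out[hi] += p * (b - lo)
    return out
```

The projected distribution serves as the training target for the network's predicted distribution, typically through a cross-entropy loss, while expected_value gives the scalar used for greedy action selection.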

Results

Incentivizing the agent to run down the track (positive rewards are given for forward velocity): https://youtu.be/OYBiUWuA4Ho

Incentivizing the agent to run down the track AND perform front flips: https://youtu.be/16JEWNf6468
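The two incentives above can be pictured as a single shaped reward with an optional flip term. The following is only a sketch under stated assumptions: the function name, the flip_weight parameter, and the convention that positive torso angular velocity means forward rotation are all hypothetical, and the reward actually used in environment_qwop.py may be defined differently.

```python
def shaped_reward(forward_velocity: float,
                  torso_angular_velocity: float = 0.0,
                  flip_weight: float = 0.0) -> float:
    """Hypothetical reward: forward velocity plus an optional flip bonus.

    Assumes positive torso_angular_velocity corresponds to forward rotation.
    With flip_weight = 0 the agent is rewarded for running only.
    """
    flip_bonus = flip_weight * max(0.0, torso_angular_velocity)
    return forward_velocity + flip_bonus
```

With flip_weight left at zero this reduces to the run-only case; a positive weight additionally rewards forward rotation of the torso.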

Usage

To run the training algorithm, edit settings.py and environment_qwop.py as appropriate, and then run python3 main.py from a terminal. The default parameters will cause the agent to run down the track, as shown in the first video above. The code is CPU-intensive and takes days to train on a modern computer. In addition to Python 3, the following Python packages must be installed:

  • psutil: pip3 install psutil
  • TensorFlow: pip3 install tensorflow (or pip3 install tensorflow-gpu for GPU support; additional steps required)
  • box2d: pip3 install box2d-py
  • matplotlib: pip3 install matplotlib
  • OpenAI gym: pip3 install gym[all]
  • virtual display: pip3 install pyvirtualdisplay

The following Linux packages must also be installed:

  • OpenGL: sudo apt-get install python-opengl
  • xvfb: sudo apt-get install xvfb
  • ffmpeg: sudo apt-get install ffmpeg
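Because training runs for days, it can be worth confirming that the dependencies listed above are present before starting. The snippet below is a minimal sketch of such a check and is not part of the repository. The import names are assumptions based on how these packages are usually imported (for example, box2d-py is normally imported as Box2D), as are the executable names Xvfb and ffmpeg.

```python
import importlib.util
import shutil
from typing import Iterable, List

# Assumed import names for the pip packages listed above.
REQUIRED_MODULES = ("psutil", "tensorflow", "Box2D", "matplotlib",
                    "gym", "pyvirtualdisplay")
# Assumed executable names for the apt packages xvfb and ffmpeg.
REQUIRED_BINARIES = ("Xvfb", "ffmpeg")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level modules that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_binaries(names: Iterable[str] = REQUIRED_BINARIES) -> List[str]:
    """Return the executables that are not on the PATH."""
    return [name for name in names if shutil.which(name) is None]
```

Calling missing_modules() and missing_binaries() with no arguments returns two lists; if both are empty, the dependencies checked here were found.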

The Environment

A QWOP dynamics environment was developed from first principles and is contained in environment_qwop.py. It consists of a stick figure with a torso, two arms, and two legs. The goal is to press the buttons Q, W, O, and P to make the stick figure translate down the track as fast as possible.
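One way to expose the four buttons to a Q-learning agent is to treat every combination of pressed keys as one discrete action, which gives 2^4 = 16 actions. The sketch below illustrates that encoding. It is an assumption for the sake of example: the names KEYS, ACTIONS, and decode_action are hypothetical, and environment_qwop.py may use a different or smaller action set.

```python
from itertools import product
from typing import Dict, List, Tuple

KEYS: Tuple[str, ...] = ("Q", "W", "O", "P")

# Every on/off combination of the four keys, in a fixed order.
# Index 0 presses nothing; index 15 presses all four keys.
ACTIONS: List[Tuple[bool, ...]] = list(product((False, True), repeat=len(KEYS)))


def decode_action(index: int) -> Dict[str, bool]:
    """Map a discrete action index to the pressed state of each key."""
    return dict(zip(KEYS, ACTIONS[index]))
```

For instance, decode_action(8) corresponds to pressing Q alone, since the last key in KEYS varies fastest in the enumeration.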