
qwop_RL

QWOP is one of the hardest games in the world; you can try it at http://www.foddy.net/Athletics.html. Our objectives are to:

  1. train a model to complete the 100m in a reasonable time
  2. improve the ragdoll's speed to a competitive level
  3. stabilize the agent's running pattern
  4. set our own best record

Setup Procedures

  1. Run pip install -r requirements.txt.
  2. Make sure chromedriver is installed; if not, download it from https://chromedriver.chromium.org/downloads and store it in the ~/qwop_RL directory.
  3. Open a terminal to host the game: python host_game.py
  4. Open another terminal to train the agent. You can train in either a CPU or GPU environment (see the sketch after this list for how the browser can be driven).
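For orientation, the sketch below shows roughly how a Python script can drive the hosted game through chromedriver. It is a minimal sketch, not the repo's actual client: the port 8000 and the use of Selenium are assumptions, so match them to host_game.py and requirements.txt.

```python
# Minimal sketch: open the hosted QWOP page and press a key.
# ASSUMPTIONS: the game is served at http://localhost:8000/ (check
# host_game.py for the actual port) and chromedriver sits in the repo root.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

driver = webdriver.Chrome(service=Service("./chromedriver"))
driver.get("http://localhost:8000/")             # page served by host_game.py
body = driver.find_element(By.TAG_NAME, "body")
body.send_keys("q")                              # QWOP is played with the Q/W/O/P keys
driver.quit()
```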

Proposed Methodology

(figure: proposed methodology)

Reward Design

(figure: reward design)

Training Flowchart

Our approach uses DQN and Double DQN (DDQN) to train the agent to complete the 100m race. The following diagram illustrates the process of updating the Q-network to approximate the value function, both for DQN with a target network and for DDQN. The two share the same update procedure except for how they compute the TD target.

(figure: DDQN training flow)
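To make the difference concrete, the sketch below spells out the two TD targets. PyTorch and all names here are illustrative only (the repo may use another framework), and the terminal-state mask is omitted for brevity.

```python
import torch

def td_target_dqn(reward, next_state, gamma, q_target):
    # DQN: the target network both selects and evaluates the next
    # action, which is known to overestimate Q-values.
    return reward + gamma * q_target(next_state).max(dim=1).values

def td_target_ddqn(reward, next_state, gamma, q_online, q_target):
    # DDQN: the online network selects the action and the target network
    # evaluates it, decoupling action selection from evaluation.
    best_action = q_online(next_state).argmax(dim=1, keepdim=True)
    return reward + gamma * q_target(next_state).gather(1, best_action).squeeze(1)
```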

Training

Before training the agent, you can configure the training parameters as well as the action/state/reward design. You can run the training in either of the following ways (an illustrative sketch of the command-line dispatch follows this list):

  1. Deep Q Network with Target Network: python dqn_main.py --train, or python dqn_main.py --retrain
  2. Double Deep Q Network: python ddqn_train.py
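The flags above suggest a dispatch along the following lines; this is a sketch only, and the real dqn_main.py may be organized differently.

```python
import argparse

# Illustrative entry point mirroring the flags above (not the repo's code).
parser = argparse.ArgumentParser(description="QWOP DQN entry point (sketch)")
parser.add_argument("--train", action="store_true", help="train a new model")
parser.add_argument("--retrain", action="store_true", help="resume from a checkpoint")
parser.add_argument("--test", action="store_true", help="evaluate a trained model")
args = parser.parse_args()

if args.train:
    ...  # build the environment and agent, then run the training loop
elif args.retrain:
    ...  # load saved weights and continue training
elif args.test:
    ...  # load saved weights and run greedy evaluation episodes
```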

Testing

  1. Deep Q Network with Target Network: python dqn_main.py --test
  2. Double Deep Q Network: python ddqn_test.py (a sketch of the evaluation loop follows this list)
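Testing amounts to rolling out the trained policy greedily. Below is a minimal sketch of such an evaluation loop; env and q_net are assumed objects, with env.step taken to return (state, reward, done).

```python
import torch

def evaluate(env, q_net, episodes=5):
    """Run a few greedy episodes (epsilon = 0) and print the returns."""
    for ep in range(episodes):
        state, done, ep_return = env.reset(), False, 0.0
        while not done:
            with torch.no_grad():
                action = q_net(state).argmax().item()  # always exploit
            state, reward, done = env.step(action)
            ep_return += reward
        print(f"episode {ep}: return = {ep_return:.1f}")
```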

Videos of Performance

  • First Running Form Trained (5~10 min):

(video: ddqn)

(video: dqn_usagii)

Credits