Reinforcement Learning with TensorFlow

Primary language: Python. License: Apache License 2.0.

An attempt to implement, in TensorFlow, a deep Q-network (DQN) similar to DeepMind's algorithm from the papers "Playing Atari with Deep Reinforcement Learning" and "Human-level control through deep reinforcement learning".
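For orientation, both papers train the network towards a one-step Q-learning target: the observed reward plus the discounted maximum Q-value of the next state, or the reward alone when the episode has ended. The following is a minimal sketch of that target in plain Python, not code from this repository. The function name td_target, its parameters, and the default discount of 0.99 are assumptions made for illustration.

```python
from typing import Sequence


def td_target(reward: float,
              next_q_values: Sequence[float],
              done: bool,
              gamma: float = 0.99) -> float:
    """One-step Q-learning target for a single transition.

    Hypothetical helper: the name, signature and default gamma are
    illustrative assumptions, not taken from this repository.
    """
    if done:
        # Terminal transition: there is no next state to bootstrap from.
        return reward
    # Non-terminal: bootstrap from the best estimated next-state action value.
    return reward + gamma * max(next_q_values)
```

In a full DQN the next_q_values would come from a network evaluated on the next state; here they are passed in as plain numbers so the arithmetic is visible.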

Special thanks to David Silver from DeepMind for his course on Reinforcement Learning.

This is a work in progress.

The reward function is:

R(s, a) =
      +1     if the agent has won or achieved some predefined goal as a result of action a
             in state s (e.g. snake game -> the snake just ate some food)
      -1     if the agent has lost as a result of action a in state s
             (e.g. snake game -> the snake went out of bounds or ran into itself)
      -0.01  otherwise
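The reward definition above can be sketched as a small Python function. This is an illustration only and not necessarily how the repository implements it: the Outcome enum and the reward function name are hypothetical, and the sketch assumes the environment can report whether a step ended in a win, a loss, or neither.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical result of taking action a in state s."""
    WON = "won"          # goal achieved, e.g. the snake just ate some food
    LOST = "lost"        # e.g. the snake went out of bounds or ran into itself
    ONGOING = "ongoing"  # neither of the above


def reward(outcome: Outcome) -> float:
    """Return R(s, a) for the outcome of a single step."""
    if outcome is Outcome.WON:
        return 1.0
    if outcome is Outcome.LOST:
        return -1.0
    # Small per-step penalty for every other step.
    return -0.01


if __name__ == "__main__":
    for outcome in Outcome:
        print(outcome.name, reward(outcome))
```

The small negative reward on ordinary steps means that an agent which neither wins nor loses still accumulates a penalty over time.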

Run with the --help flag to list the available command-line arguments.

Video: after 10 hours of training (on a CPU...):

[Video: after 10 hours]

Score distribution after 10 hours of training:

[Figure: score distribution after 10 hours of training]

Score distribution after 7 hours of training:

[Figure: score distribution after 7 hours of training]

Initial score distribution (random actions):

[Figure: initial score distribution (random actions)]

More soon.