Reinforcement Learning with TensorFlow

An attempt at implementing a Deep Q-Network (DQN) in TensorFlow, similar to DeepMind's DQN algorithm from the papers "Playing Atari with Deep Reinforcement Learning" and "Human-level control through deep reinforcement learning".

Special thanks to David Silver from DeepMind for his course on Reinforcement Learning.

This is a work in progress.

R(s, a) =
      +1    if the agent has won or achieved some predefined goal as a result of action a
            in state s (e.g. snake game -> snake just ate some food)
      -1    if the agent has lost as a result of action a in state s
            (e.g. snake game -> snake went out of bounds or ran into itself)
      -0.01 otherwise
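
The reward function above can be sketched in a few lines of Python. This is only an illustration of the three cases, not the code used in this repository: the `Outcome` enum and the `reward` function are hypothetical names, and the sketch assumes the game has already classified the result of taking action a in state s.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for the result of taking action a in state s."""
    WON = "won"          # e.g. the snake just ate some food
    LOST = "lost"        # e.g. the snake went out of bounds or ran into itself
    NEUTRAL = "neutral"  # anything else


def reward(outcome: Outcome) -> float:
    """Return R(s, a) given the (assumed, precomputed) outcome of the step."""
    if outcome is Outcome.WON:
        return 1.0
    if outcome is Outcome.LOST:
        return -1.0
    # Small per-step penalty for every step that neither wins nor loses.
    return -0.01
```

The -0.01 case means every step that neither wins nor loses costs a little, which presumably nudges the agent toward reaching the goal quickly rather than wandering.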

Run with the --help flag to see possible command line args.

Video: After 10 hours of training (on a CPU...):

Score distribution after 10 hours of training:

Score distribution after 7 hours of training:

Initial Score Distribution (random actions):

More soon.
