An attempt at implementing a deep Q-network (DQN) in TensorFlow, similar to DeepMind's DQN algorithm from the papers "Playing Atari with Deep Reinforcement Learning" and "Human-level control through deep reinforcement learning".
Special thanks to David Silver from DeepMind for his course on Reinforcement Learning.
This is a work in progress.
R(s, a) =
  +1 if the agent has won or achieved some predefined goal as a result of the action a in state s (e.g. snake game -> snake just ate some food)
  -1 if the agent has lost as a result of the action a in state s (e.g. snake game -> snake went out of bounds or ran into itself)
  -0.01 otherwise
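The reward scheme above could be written roughly as follows. This is only a minimal sketch, not the repository's actual code: the `Outcome` enum and the `reward` function are hypothetical names introduced for illustration, and it assumes the environment can already tell whether an action led to a win, a loss, or neither.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for the result of taking action a in state s."""
    WON = "won"          # e.g. the snake just ate some food
    LOST = "lost"        # e.g. the snake went out of bounds or ran into itself
    NEUTRAL = "neutral"  # nothing decisive happened


def reward(outcome: Outcome) -> float:
    """Return R(s, a) for the outcome of taking action a in state s."""
    if outcome is Outcome.WON:
        return 1.0
    if outcome is Outcome.LOST:
        return -1.0
    # Small negative reward for every other step.
    return -0.01
```

For example, `reward(Outcome.NEUTRAL)` gives the -0.01 step reward, so an agent that never reaches a goal accumulates a slowly growing penalty.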
Run with the --help flag to see the available command-line arguments.
More soon.