
A library of Reinforcement Learning agents

Primary language: Python | License: MIT

<abstractpaper>

prop

prop is a library of Reinforcement Learning agents implemented in PyTorch.

Algorithms

Algorithm  Model       Policy
DQN        Model-Free  Off-Policy
A2C        Model-Free  On-Policy

DQN

Deep Q-Learning is a variant of Q-learning that uses a deep neural network to estimate Q-values (hence DQN: Deep Q-Network).

Both DQN and DDQN (Double DQN) are implemented.
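
The difference between the two variants lies in how the bootstrap target is computed. The sketch below illustrates this in plain Python, kept dependency-free rather than written in PyTorch. It is not prop's actual code: the function names, the `gamma` default, and the sample Q-values are all assumptions made for illustration.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN target: r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network picks the action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for three actions in the next state.
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.5, 0.2]

y_dqn = dqn_target(1.0, next_q_target, gamma=0.5)
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5)
print(y_dqn, y_ddqn)
```

With these made-up numbers, the DQN target takes the largest target-network value, while the Double DQN target uses the target-network value of the action the online network prefers. Decoupling selection from evaluation is what reduces overestimation of Q-values.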

A2C

Advantage Actor Critic is a variant of Actor-Critic that:

  • Uses a neural network to approximate a policy and a value function.
  • Computes the advantage of each action and uses it to scale the policy gradient. This acts as a vote of confidence (or skepticism) on the actions produced by the actor.
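
The two points above can be sketched for a single transition as follows. This is a minimal illustration in plain Python, not prop's implementation. It assumes a one-step temporal-difference estimate of the advantage (the source does not say which estimator is used), and the function name, parameters, and numbers are hypothetical.

```python
import math


def a2c_losses(prob_action, value, next_value, reward, gamma=0.99, done=False):
    """Return (advantage, actor_loss, critic_loss) for one transition.

    prob_action: probability the actor assigned to the action taken
    value, next_value: critic estimates V(s) and V(s')
    """
    # One-step TD target; no bootstrapping from a terminal state.
    target = reward if done else reward + gamma * next_value
    advantage = target - value
    # The advantage scales the log-probability term of the policy gradient.
    # In an autograd framework it would be treated as a constant here.
    actor_loss = -math.log(prob_action) * advantage
    # The critic is regressed toward the TD target.
    critic_loss = advantage ** 2
    return advantage, actor_loss, critic_loss


# Action turned out better than the critic expected: positive advantage.
adv_good, actor_good, critic_good = a2c_losses(0.5, 1.0, 2.0, 1.0, gamma=0.5)

# Same action, but the critic expected more: negative advantage.
adv_bad, actor_bad, critic_bad = a2c_losses(0.5, 3.0, 2.0, 1.0, gamma=0.5)
print(adv_good, adv_bad)
```

A positive advantage makes minimizing the actor loss raise the probability of the action taken (confidence). A negative advantage flips the sign and lowers it (skepticism).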