/PYTORCH_RL

Some Reinforcement Learning algorithms in PyTorch.

Deep Reinforcement Learning with PyTorch

Deep Q-Network (DQN)

This implementation of the Deep Q-Network ("Human-level control through deep reinforcement learning") can be augmented with optional features, which are enabled or disabled per experiment in the settings listed for it.

Experiment : CartPole-v1 :

  • Adam
  • learning rate : 1e-4
  • minibatch size : 128
  • replay memory capacity : 25e3
  • prioritized experience replay exponent $\alpha$ : 0.5
  • number of threads/workers : 1
  • double DQN : [x]
  • hindsight experience replay : [ ]
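
The `double DQN : [x]` setting changes how the bootstrap target is computed: the online network selects the next action and the target network evaluates it. The snippet below is a minimal sketch of that target, not this repository's code. The function name `double_dqn_target`, the tensor layout and the discount `gamma` are assumptions made for illustration (the settings above do not state a discount factor).

```python
import torch

def double_dqn_target(online_q_next, target_q_next, reward, done, gamma=0.99):
    """Double DQN target (hypothetical helper, gamma is an assumed value).

    online_q_next, target_q_next: (batch, n_actions) Q-values of the next states.
    reward, done: (batch,) tensors; done is 1.0 for terminal transitions.
    """
    # The online network chooses the greedy next action...
    best_action = online_q_next.argmax(dim=1, keepdim=True)        # (batch, 1)
    # ...and the target network provides the value of that action.
    next_value = target_q_next.gather(1, best_action).squeeze(1)   # (batch,)
    # Terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - done) * next_value

# Toy batch of two transitions with two actions each.
online_q_next = torch.tensor([[1.0, 2.0], [3.0, 0.5]])
target_q_next = torch.tensor([[0.7, 0.2], [0.1, 0.9]])
reward = torch.tensor([1.0, 1.0])
done = torch.tensor([0.0, 1.0])

target = double_dqn_target(online_q_next, target_q_next, reward, done, gamma=0.9)
print(target)
```

In the first transition the online network prefers action 1, so the target uses the target network's value 0.2 for that action rather than its own maximum 0.7, which is what reduces the overestimation of plain DQN. The second transition is terminal, so its target is just the reward.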

Figure: resultDQN1 (results of the CartPole-v1 experiment)

Deep Deterministic Policy Gradient (DDPG)

This implementation of the Deep Deterministic Policy Gradient ("Continuous Control with Deep Reinforcement Learning") can be augmented with optional features, which are enabled or disabled per experiment in the settings listed for it.

Experiment : Pendulum-v0 :

  • Adam
  • learning rate : 1e-4
  • minibatch size : 128
  • soft update $\tau$ : 1e-3
  • replay memory capacity : 1e6
  • prioritized experience replay exponent $\alpha$ : 0.0 (no priority)
  • number of threads/workers : 1
  • hindsight experience replay : [ ]
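
The soft update coefficient $\tau$ controls how quickly the target networks track the learned networks: after each update, every target parameter moves a fraction $\tau$ of the way towards its online counterpart. A minimal sketch follows, assuming both networks are `torch.nn.Module` instances with identical architectures. The helper name `soft_update` and the toy `nn.Linear` networks are illustrative and not taken from this repository.

```python
import torch
from torch import nn

@torch.no_grad()
def soft_update(target_net, online_net, tau=1e-3):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    for t_param, o_param in zip(target_net.parameters(), online_net.parameters()):
        t_param.mul_(1.0 - tau).add_(o_param, alpha=tau)

# Toy networks with known weights so the effect is easy to inspect.
online = nn.Linear(2, 1)
target = nn.Linear(2, 1)
with torch.no_grad():
    for p in online.parameters():
        p.fill_(1.0)
    for p in target.parameters():
        p.fill_(0.0)

soft_update(target, online, tau=1e-3)
print([p.flatten().tolist() for p in target.parameters()])
```

With $\tau$ = 1e-3, a target parameter starting at 0 moves to 0.001 when the online parameter is 1, so the target network changes slowly and gives stable bootstrap targets.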

Figure: resultDDPG1 (results of the Pendulum-v0 experiment)

Proximal Policy Optimization (PPO)

This implementation of "Proximal Policy Optimization Algorithms" can be augmented with optional features, which are enabled or disabled per experiment in the settings listed for it.

Experiment : Pendulum-v0 :

  • Adam
  • learning rate : 1e-6
  • minibatch size : 64
  • soft update $\tau$ : 1e-3
  • replay memory capacity : 25e3
  • prioritized experience replay exponent $\alpha$ : 0.0 (no priority)
  • number of threads/workers : 1
  • hindsight experience replay : [ ]
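
The note "0.0 (no priority)" follows from the standard prioritized experience replay rule, in which transition $i$ is sampled with probability $P(i) = p_i^{\alpha} / \sum_k p_k^{\alpha}$. With $\alpha = 0$ every term equals 1 and sampling is uniform. The sketch below is written in plain Python for readability and is not this repository's replay buffer. The function name `sampling_probabilities` and the priority values are made up for illustration.

```python
def sampling_probabilities(priorities, alpha):
    """Return P(i) = p_i**alpha / sum_k p_k**alpha for positive priorities."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

# Hypothetical priorities of three stored transitions.
priorities = [1.0, 4.0, 9.0]

uniform = sampling_probabilities(priorities, alpha=0.0)   # no prioritization
partial = sampling_probabilities(priorities, alpha=0.5)   # value used in the DQN experiment

print(uniform)
print(partial)
```

For these priorities, $\alpha = 0$ gives each transition probability 1/3, while $\alpha = 0.5$ gives 1/6, 1/3 and 1/2, so high-priority transitions are favored without being sampled exclusively.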

Figure: resultPPO1 (results of the Pendulum-v0 experiment)