DDPG-HER

Implementation of Deep Deterministic Policy Gradient with Hindsight Experience Replay.

DDPG + HER

Implementation of Deep Deterministic Policy Gradient (DDPG) with the Hindsight Experience Replay (HER) extension on the MuJoCo robotic FetchPickAndPlace environment.

See the vanilla_DDPG branch for the implementation without the HER extension.
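For readers new to HER, the idea can be illustrated with a minimal sketch of the "future" relabeling strategy from Andrychowicz et al.: each transition is stored again with its goal replaced by a goal actually achieved later in the same episode, so the sparse reward becomes informative. This is not the code used in this repository. The function names, the dictionary layout of a transition, the 0.05 distance threshold, and `k=4` are assumptions chosen for illustration, and the repository may relabel at sampling time rather than at storage time.

```python
import random


def sparse_reward(achieved_goal, desired_goal, threshold=0.05):
    """Sparse goal reward: 0.0 within `threshold` of the goal, otherwise -1.0.

    The threshold value is an assumption for this sketch.
    """
    dist = sum((a - d) ** 2 for a, d in zip(achieved_goal, desired_goal)) ** 0.5
    return 0.0 if dist <= threshold else -1.0


def her_relabel(episode, k=4, rng=random):
    """Return the episode's transitions plus k hindsight copies of each one.

    `episode` is a list of dicts with the (hypothetical) keys:
      obs, action, achieved_goal (goal reached after the action), goal (desired).
    """
    transitions = []
    last = len(episode) - 1
    for t, step in enumerate(episode):
        # Original transition, rewarded against the goal that was actually requested.
        reward = sparse_reward(step["achieved_goal"], step["goal"])
        transitions.append({**step, "reward": reward})

        # "future" strategy: substitute goals achieved at this or a later step.
        for _ in range(k):
            future = rng.randint(t, last)  # inclusive on both ends
            new_goal = episode[future]["achieved_goal"]
            reward = sparse_reward(step["achieved_goal"], new_goal)
            transitions.append({**step, "goal": new_goal, "reward": reward})
    return transitions
```

Calling `her_relabel(episode)` on an episode that never reached its desired goal still yields some transitions with reward 0.0, which is what lets an off-policy learner such as DDPG make progress under sparse rewards.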

Dependencies

  • gym == 0.17.2
  • matplotlib == 3.1.2
  • mpi4py == 3.0.3
  • mujoco-py == 2.0.2.13
  • numpy == 1.19.1
  • opencv_contrib_python == 3.4.0.12
  • psutil == 5.4.2
  • torch == 1.4.0

Installation

pip3 install -r requirements.txt

Usage

mpirun -np $(nproc) python3 -u main.py

Demo

Result

References

  1. Continuous control with deep reinforcement learning, Lillicrap et al., 2015
  2. Hindsight Experience Replay, Andrychowicz et al., 2017
  3. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, Plappert et al., 2018

Acknowledgement

All credit goes to @TianhongDai for his simplified implementation of OpenAI's original code.