PyTorch implementation of the paper "Overcoming Exploration in Reinforcement Learning with Demonstrations", applied to surgical robot manipulation tasks.

Primary language: Python. License: MIT.

Hindsight Experience Replay with Demonstrations

Acknowledgement

  • OpenAI Baselines for the TensorFlow-based implementation.
  • SurRoL for the training and testing simulation platform.
  • DrQv2 for the coding structure and utils modules.

Setup

We use Python 3.8 and Anaconda3 for development. To create an environment and install dependencies, run the following steps:

# Clone the repository and cd into it
git clone https://github.com/TaoHuang13/hindsight-experience-replay-with-demo.git
cd hindsight-experience-replay-with-demo

# Create and activate environment
conda create -n herdemo python=3.8 -y
conda activate herdemo

# Install dependencies
pip install -e .

Then add one line of code in gym/gym/envs/__init__.py to register SurRoL tasks:

import surrol.gym
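
If you prefer not to edit the file by hand, the same step can be scripted. The snippet below is a minimal sketch, not part of this repository: ensure_line is a hypothetical helper that appends the import only when it is missing, and it assumes the script is run from the repository root so that the relative path above resolves.

```python
from pathlib import Path

REGISTER_LINE = "import surrol.gym"


def ensure_line(path, line=REGISTER_LINE):
    """Append `line` to the file at `path` unless it is already present.

    Hypothetical helper for illustration. Returns True if the file was
    modified and False if the line was already there.
    """
    target = Path(path)
    text = target.read_text()
    # Compare against stripped lines so surrounding whitespace does not matter.
    if any(existing.strip() == line for existing in text.splitlines()):
        return False
    # Make sure the appended import starts on its own line.
    separator = "" if text.endswith("\n") or not text else "\n"
    target.write_text(text + separator + line + "\n")
    return True


if __name__ == "__main__":
    # Path taken from the step above, relative to the repository root.
    init_file = Path("gym/gym/envs/__init__.py")
    if init_file.exists():
        changed = ensure_line(init_file)
        print("added" if changed else "already present")
    else:
        print(f"{init_file} not found; run this from the repository root")
```

Running the script twice is safe, since the second run finds the import and leaves the file untouched.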

Run the following command to collect expert demonstrations using the scripted policy defined in each task file (replace env_name with the name of the task):

python surrol/data/data_generation.py --env env_name

We have already provided demonstrations for several tasks.

Code Navigation

At a high level, our code relies on a generic Python script, train.py, for training and evaluating RL agents. We use Hydra to parameterize this script with experiment-specific configuration. All experiments should be configured either in the configs/ directory or through command-line overrides.
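
To illustrate how file defaults and command-line overrides combine, here is a minimal plain-Python sketch of dotted key=value overrides. It is not Hydra's implementation and does not use this repository's config files: apply_overrides, parse_value, and the keys seed and agent.lr are hypothetical names chosen for the example.

```python
import copy


def parse_value(raw):
    """Convert an override string to bool, int, or float when possible."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw  # fall back to the raw string


def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` with dotted `key=value` overrides applied."""
    config = copy.deepcopy(defaults)  # never mutate the defaults
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Walk into nested sections, creating them when absent.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return config


if __name__ == "__main__":
    # Hypothetical defaults, standing in for a file under configs/.
    defaults = {"seed": 1, "agent": {"lr": 0.001}}
    # Equivalent in spirit to: python train.py seed=3 agent.lr=0.0003
    print(apply_overrides(defaults, ["seed=3", "agent.lr=0.0003"]))
```

With Hydra itself, the same kind of override is passed directly on the command line after the script name, and values not overridden keep their defaults from the config file.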

The rest of the code is organized as follows:

  • configs/ config files for launching experiments.
  • rl/ core implementation of HER+DEMO adapted from OpenAI Baselines.
  • surrol/ simulation platform for surgical robotic manipulation based on PyBullet.
  • scripts/ bash scripts for running a batch of experiments.
  • train.py generic Python script for training and evaluating RL agents.
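
For readers new to HER, the core idea behind the rl/ module can be sketched in a few lines. The snippet below is an illustration only, not the code in rl/: it shows the "future" relabeling strategy from the HER paper on a toy episode, and the dictionary keys, the tol threshold, and the function names are assumptions made for this example.

```python
import random


def sparse_reward(achieved, goal, tol=0.05):
    """Return 0.0 when `achieved` is within `tol` of `goal`, else -1.0."""
    distance = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if distance <= tol else -1.0


def relabel_future(episode, rng, tol=0.05):
    """Relabel each transition with a goal achieved later in the same episode.

    `episode` is a list of dicts with keys "achieved_next" (the goal reached
    after the step) and "goal" (the goal originally commanded).
    """
    relabeled = []
    last = len(episode) - 1
    for t, step in enumerate(episode):
        # Pick the current or a later step; randint is inclusive on both ends.
        future_t = rng.randint(t, last)
        new_goal = episode[future_t]["achieved_next"]
        relabeled.append({
            **step,
            "goal": new_goal,
            # The reward is recomputed against the substituted goal.
            "reward": sparse_reward(step["achieved_next"], new_goal, tol),
        })
    return relabeled


if __name__ == "__main__":
    # Toy 1-D episode that never reaches the commanded goal at 1.0.
    episode = [
        {"achieved_next": [0.1], "goal": [1.0], "reward": -1.0},
        {"achieved_next": [0.2], "goal": [1.0], "reward": -1.0},
        {"achieved_next": [0.3], "goal": [1.0], "reward": -1.0},
    ]
    for step in relabel_future(episode, random.Random(0)):
        print(step)
```

Relabeling turns a failed episode into one that contains successful transitions for the substituted goals, which is what makes sparse-reward tasks learnable. Demonstrations add further successful trajectories on top of this.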

To start an experiment, run the following command:

sh scripts/run_herdemo.sh