hiro_pytorch

Implementation of HIRO (Data-Efficient Hierarchical Reinforcement Learning)

Overview

A PyTorch implementation of Data-Efficient Hierarchical Reinforcement Learning (HIRO).

[Figure: Demonstration]

Installation

  1. Install MuJoCo by following the OpenAI Gym MuJoCo installation steps:
     1. Obtain a 30-day free trial on the MuJoCo website, or a free license if you are a student. The license key will arrive in an email with your username and password.
     2. Download the MuJoCo version 2.0 binaries for Linux or OSX.
     3. Unzip the downloaded mujoco200 directory into ~/.mujoco/mujoco200, and place your license key (the mjkey.txt file from your email) at ~/.mujoco/mjkey.txt.
  2. Install the dependencies:
     pip install -r requirements.txt
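
Before training, it can help to confirm that the MuJoCo files ended up where the steps above put them. The following is a minimal sketch, not part of this repository: the helper name check_mujoco_layout is hypothetical, and it only checks that the two paths named above exist, not that the license is valid.

```python
from pathlib import Path


def check_mujoco_layout(home=None):
    """Report whether the MuJoCo paths named in the install steps exist.

    `home` defaults to the current user's home directory; passing another
    directory is only useful for testing this helper.
    """
    home = Path(home) if home is not None else Path.home()
    root = home / ".mujoco"
    return {
        # The unzipped binaries directory: ~/.mujoco/mujoco200
        "binaries": (root / "mujoco200").is_dir(),
        # The license key file: ~/.mujoco/mjkey.txt
        "license": (root / "mjkey.txt").is_file(),
    }


if __name__ == "__main__":
    for name, found in check_mujoco_layout().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If either entry is reported as missing, revisit the unzip and license steps before running pip install.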

Run

For HIRO,

python main.py --train

For TD3,

python main.py --train --td3
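
The flags used in this README (--train, --eval, --td3) are simple on/off switches. As an illustration only, here is a sketch of how such flags could be parsed with argparse. The function names build_parser and describe are hypothetical, and the actual main.py may define these flags differently or accept more of them.

```python
import argparse


def build_parser():
    """Build a parser for the three boolean flags shown in this README."""
    parser = argparse.ArgumentParser(
        description="Hypothetical sketch of the main.py command-line flags"
    )
    parser.add_argument("--train", action="store_true", help="train an agent")
    parser.add_argument("--eval", action="store_true", help="evaluate a trained agent")
    parser.add_argument("--td3", action="store_true", help="use TD3 instead of HIRO")
    return parser


def describe(argv):
    """Return (algorithm, mode) for a list of command-line arguments."""
    args = build_parser().parse_args(argv)
    algorithm = "TD3" if args.td3 else "HIRO"
    if args.train:
        mode = "train"
    elif args.eval:
        mode = "eval"
    else:
        mode = "none"
    return algorithm, mode
```

Under this sketch, omitting --td3 selects HIRO, which matches the commands above.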

Evaluate Trained Model

Passing the --eval argument loads the most recently saved model parameters and starts playing. The goal is to reach the position (0, 16), which is the top-left corner.

For HIRO,

python main.py --eval

For TD3,

python main.py --eval --td3
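
A success check against the goal position (0, 16) can be sketched as a distance test. This is an illustrative assumption, not the repository's code: the function name reached_goal is hypothetical, and the default threshold of 5.0 is a placeholder value that should be checked against the evaluation code actually used here.

```python
import math


def reached_goal(position, goal=(0.0, 16.0), threshold=5.0):
    """Return True if the (x, y) position is within `threshold` of `goal`.

    The threshold default is an assumed value for illustration only.
    """
    dx = position[0] - goal[0]
    dy = position[1] - goal[1]
    # Euclidean distance between the agent's position and the goal.
    return math.hypot(dx, dy) <= threshold
```

A success rate is then the fraction of evaluation episodes for which this check holds at the end of the episode.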

Training Results

In each plot, blue is HIRO and orange is TD3.

Success Rate

[Figure: Success_Rate]

Reward

[Figure: reward_Reward]

Intrinsic Reward

[Figure: reward_Intrinsic_Reward]
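
For context, the HIRO paper defines the lower-level controller's intrinsic reward as the negative distance between the state the goal asked for and the state actually reached, r = -||s_t + g_t - s_{t+1}||_2, and carries the goal forward with h(s_t, g_t, s_{t+1}) = s_t + g_t - s_{t+1}. The sketch below follows that definition in plain Python. The function names are hypothetical, and this repository's implementation may differ, for example by using only a subset of the state dimensions or by scaling the reward.

```python
import math


def goal_transition(state, goal, next_state):
    """Re-express the goal relative to the new state: s_t + g_t - s_{t+1}."""
    return [s + g - s_next for s, g, s_next in zip(state, goal, next_state)]


def intrinsic_reward(state, goal, next_state):
    """Negative Euclidean norm of the remaining offset to the goal."""
    remaining = goal_transition(state, goal, next_state)
    return -math.sqrt(sum(d * d for d in remaining))
```

The reward is highest (zero) when the lower-level controller moves the state exactly by the goal offset, and becomes more negative the further the reached state is from the requested one.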

Losses

Higher Controller Actor
[Figure: loss_actor_loss_high]

Higher Controller Critic
[Figure: loss_critic_loss_high]

Lower Controller Actor
[Figure: loss_actor_loss_low]

Lower Controller Critic
[Figure: loss_critic_loss_low]

TD3 Controller Actor
[Figure: loss_actor_loss_td3]

TD3 Controller Critic
[Figure: loss_critic_loss_td3]