Skill-based Model-based Reinforcement Learning (CoRL 2022)

Skill-based Model-based Reinforcement Learning (SkiMo)

[Project website] [Paper] [arXiv]

This project is a PyTorch implementation of Skill-based Model-based Reinforcement Learning, published in CoRL 2022.

Files and Directories

  • run.py: launches the appropriate trainer for the selected algorithm
  • skill_trainer.py: trainer for skill-based approaches
  • skimo_agent.py: model and training code for SkiMo
  • skimo_rollout.py: rollout with SkiMo agent
  • spirl_tdmpc_agent.py: model and training code for SPiRL+TD-MPC
  • spirl_tdmpc_rollout.py: rollout with SPiRL+TD-MPC
  • spirl_dreamer_agent.py: model and training code for SPiRL+Dreamer
  • spirl_dreamer_rollout.py: rollout with SPiRL+Dreamer
  • spirl_trainer.py: trainer for SPiRL
  • spirl_agent.py: model for SPiRL
  • config/: default hyperparameters
  • calvin/: CALVIN environments
  • d4rl/: D4RL environments forked by Karl Pertsch. Our only change is to the installation command
  • envs/: environment wrappers
  • spirl/: SPiRL code
  • data/: offline data directory
  • rolf/: implementation of RL algorithms from robot-learning by Youngwoon Lee
  • log/: training log, evaluation results, checkpoints

Prerequisites

  • Ubuntu 20.04
  • Python 3.9
  • MuJoCo 2.1

Installation

  1. Clone this repository.
git clone --recursive git@github.com:clvrai/skimo.git
cd skimo
  2. Create a virtual environment.
conda create -n skimo_venv python=3.9
conda activate skimo_venv
  3. Install MuJoCo 2.1.
  • Download the MuJoCo version 2.1 binaries for Linux or OSX.
  • Extract the downloaded mujoco210 directory into ~/.mujoco/mujoco210.
  4. Install packages.
sh install.sh
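
If the package installation fails, one thing worth ruling out is that MuJoCo was not extracted to the location named in the MuJoCo step above. The following is a minimal sketch and is not part of this repository: the helper name mujoco_dir_ok is hypothetical, and it only checks that the directory exists, not that the binaries inside are valid.

```python
from pathlib import Path


def mujoco_dir_ok(home=None):
    """Return True if <home>/.mujoco/mujoco210 exists as a directory.

    Hypothetical helper: `home` defaults to the current user's home directory.
    """
    home = Path(home) if home is not None else Path.home()
    return (home / ".mujoco" / "mujoco210").is_dir()


if __name__ == "__main__":
    print("MuJoCo 2.1 found" if mujoco_dir_ok() else "MuJoCo 2.1 missing")
```

Passing an explicit home directory is only there to make the check easy to try against a scratch directory.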

Download Offline Datasets

# Create and enter the data directory
mkdir data && cd data

# Maze
gdown 1GWo8Vr8Xqj7CfJs7TaDsUA6ELno4grKJ

# Kitchen (and mis-aligned kitchen)
gdown 1Fym9prOt5Cu_I73F20cdd3lXZPhrvEsd

# CALVIN
gdown 1g4ONf_3cNQtrZAo2uFa_t5MOopSr2DNY

cd ..

Usage

The commands below run SkiMo and all baselines. Results are logged to WandB. Before running them, change the wandb entity in run.py#L36 to match your account.

Environment

Replace [ENV] with one of maze, kitchen, or calvin. For mis-aligned kitchen, append env.task=misaligned to the downstream RL command. After pre-training, set [PRETRAINED_CKPT] to the path of the resulting checkpoint.
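
The placeholder substitution described here can be sketched as a small Python helper that assembles the SkiMo command line. This is an illustration only and not part of the repository: the function name skimo_command, its arguments, and the path path/to/ckpt are hypothetical, and restricting the misaligned task to the kitchen environment in the downstream RL phase is an assumption drawn from the wording above.

```python
import shlex

VALID_ENVS = ("maze", "kitchen", "calvin")


def skimo_command(env, phase="pretrain", ckpt=None, misaligned=False):
    """Return the argument list for a SkiMo run (hypothetical helper)."""
    if env not in VALID_ENVS:
        raise ValueError(f"env must be one of {VALID_ENVS}, got {env!r}")
    if phase not in ("pretrain", "rl"):
        raise ValueError(f"phase must be 'pretrain' or 'rl', got {phase!r}")

    # Base command shared by pre-training and downstream RL.
    args = ["python", "run.py", "--config-name", f"skimo_{env}",
            "run_prefix=test", "gpu=0", "wandb=true"]

    if phase == "rl":
        if ckpt is None:
            raise ValueError("downstream RL needs a pre-trained checkpoint path")
        args += ["rolf.phase=rl", f"rolf.pretrain_ckpt_path={ckpt}"]

    if misaligned:
        # Assumption: the misaligned task applies only to kitchen downstream RL.
        if env != "kitchen" or phase != "rl":
            raise ValueError("misaligned is only used with kitchen downstream RL")
        args.append("env.task=misaligned")

    return args


if __name__ == "__main__":
    # "path/to/ckpt" is a placeholder, not a real checkpoint location.
    cmd = skimo_command("kitchen", phase="rl", ckpt="path/to/ckpt", misaligned=True)
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the result usable with subprocess.run without extra shell quoting.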

SkiMo (Ours)

  • Pre-training
python run.py --config-name skimo_[ENV] run_prefix=test gpu=0 wandb=true

You can also skip this step by downloading our pre-trained model checkpoints. See instructions in pretrained_models.md.

  • Downstream RL
python run.py --config-name skimo_[ENV] run_prefix=test gpu=0 wandb=true rolf.phase=rl rolf.pretrain_ckpt_path=[PRETRAINED_CKPT]

Dreamer

python run.py --config-name dreamer_config env=[ENV] run_prefix=test gpu=0 wandb=true

TD-MPC

python run.py --config-name tdmpc_config env=[ENV] run_prefix=test gpu=0 wandb=true

SPiRL

  • First pre-train or download the skill prior (see instructions here).
  • Downstream RL
python run.py --config-name spirl_config env=[ENV] run_prefix=test gpu=0 wandb=true

SPiRL+Dreamer

  • Downstream RL
python run.py --config-name spirl_dreamer_[ENV] run_prefix=test gpu=0 wandb=true

SPiRL+TD-MPC

  • Downstream RL
python run.py --config-name spirl_tdmpc_[ENV] run_prefix=test gpu=0 wandb=true

SkiMo+SAC

  • Downstream RL
python run.py --config-name skimo_[ENV] run_prefix=sac gpu=0 wandb=true rolf.phase=rl rolf.use_cem=false rolf.n_skill=1 rolf.prior_reg_critic=true rolf.sac=true rolf.pretrain_ckpt_path=[PRETRAINED_CKPT]

SkiMo w/o joint training

  • Pre-training
python run.py --config-name skimo_[ENV] run_prefix=no_joint gpu=0 wandb=true rolf.joint_training=false
  • Downstream RL
python run.py --config-name skimo_[ENV] run_prefix=no_joint gpu=0 wandb=true rolf.joint_training=false rolf.phase=rl rolf.pretrain_ckpt_path=[PRETRAINED_CKPT]

Troubleshooting

Failed building wheel for mpi4py

Solution: install mpi4py with conda instead, which requires an older version of Python.

conda install python==3.8
conda install mpi4py

Now you can re-run sh install.sh.
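
To confirm the conda install took effect before re-running the script, a quick check along these lines may help. It is a sketch that is not part of the repository; the helper name mpi4py_available is hypothetical, and it only tests that the package can be found by the active interpreter, not that MPI itself works.

```python
import importlib.util
import sys


def mpi4py_available():
    """Return True if the active interpreter can locate the mpi4py package."""
    return importlib.util.find_spec("mpi4py") is not None


if __name__ == "__main__":
    version = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {version}, mpi4py available: {mpi4py_available()}")
```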

macOS mujoco-py compilation error

See this. In my case, I needed to change /usr/local/ to /opt/homebrew/ in all paths.

Citation

If you find our code useful for your research, please cite:

@inproceedings{shi2022skimo,
  title={Skill-based Model-based Reinforcement Learning},
  author={Lucy Xiaoyang Shi and Joseph J. Lim and Youngwoon Lee},
  booktitle={Conference on Robot Learning},
  year={2022}
}
