Code for "Efficient Offline Policy Optimization with a Learned Model" (ICLR 2023)

ROSMO


Introduction

This repository contains the implementation of ROSMO, a Regularized One-Step Model-based algorithm for offline RL, introduced in our paper "Efficient Offline Policy Optimization with a Learned Model". We provide training code for both the Atari and BSuite experiments, and the reproduced results on Atari MsPacman are publicly available on W&B.

Installation

Please follow the installation guide.

Usage

BSuite

To run the BSuite experiments, first download the datasets and place them in the directory defined by CONFIG.data_dir in experiment/bsuite/config.py.
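Before launching a run, it can help to confirm that the dataset directory exists and is not empty. The following is a minimal standalone sketch, not part of the repository: the helper name `dataset_dir_ready` and the example path are hypothetical, and the real location is whatever CONFIG.data_dir is set to in experiment/bsuite/config.py.

```python
from pathlib import Path


def dataset_dir_ready(data_dir: str) -> bool:
    """Return True if data_dir is an existing directory with at least one entry."""
    path = Path(data_dir).expanduser()
    if not path.is_dir():
        return False
    # any() stops at the first entry, so large directories are not fully listed.
    return any(path.iterdir())


if __name__ == "__main__":
    # Hypothetical path: replace with the value of CONFIG.data_dir.
    example_dir = "./datasets/bsuite"
    print(f"{example_dir} ready: {dataset_dir_ready(example_dir)}")
```

This only checks that something is present in the directory; it does not validate the dataset contents or format.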

  1. Debug run.
     python experiment/bsuite/main.py -exp_id test -env cartpole
  2. Enable W&B logger and start training.
     python experiment/bsuite/main.py -exp_id test -env cartpole -nodebug -use_wb -user ${WB_USER}

Atari

The following commands are examples of training 1) a ROSMO agent, 2) its sampling variant, and 3) a MuZero Unplugged (MZU) agent on the game MsPacman.

  1. Train ROSMO with the exact policy target.
     python experiment/atari/main.py -exp_id rosmo -env MsPacman -nodebug -use_wb -user ${WB_USER}
  2. Train ROSMO with the sampled policy target (N=4).
     python experiment/atari/main.py -exp_id rosmo-sample-4 -sampling -env MsPacman -nodebug -use_wb -user ${WB_USER}
  3. Train MuZero Unplugged as a baseline (N=20).
     python experiment/atari/main.py -exp_id mzu-sample-20 -algo mzu -num_simulations 20 -env MsPacman -nodebug -use_wb -user ${WB_USER}
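When scripting several of these runs, the three commands above can be assembled programmatically. The sketch below is one possible way to do so and is not part of the repository: the helper `build_atari_command` and the username `my-wandb-user` are hypothetical, and it uses only the flags that appear in the commands above. It prints the commands rather than executing them.

```python
import shlex
from typing import List, Optional


def build_atari_command(
    exp_id: str,
    env: str,
    wb_user: str,
    algo: Optional[str] = None,
    sampling: bool = False,
    num_simulations: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list using only the flags shown in this README."""
    cmd = ["python", "experiment/atari/main.py", "-exp_id", exp_id]
    if sampling:
        cmd.append("-sampling")
    if algo is not None:
        cmd += ["-algo", algo]
    if num_simulations is not None:
        cmd += ["-num_simulations", str(num_simulations)]
    cmd += ["-env", env, "-nodebug", "-use_wb", "-user", wb_user]
    return cmd


if __name__ == "__main__":
    user = "my-wandb-user"  # hypothetical W&B username
    runs = [
        build_atari_command("rosmo", "MsPacman", user),
        build_atari_command("rosmo-sample-4", "MsPacman", user, sampling=True),
        build_atari_command("mzu-sample-20", "MsPacman", user,
                            algo="mzu", num_simulations=20),
    ]
    for cmd in runs:
        # Printed only; pass cmd to subprocess.run(cmd, check=True) to launch.
        print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell-quoting problems if an experiment ID or username contains spaces or special characters.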

Citation

If you find this work useful for your research, please consider citing:

@inproceedings{
  liu2023rosmo,
  title={Efficient Offline Policy Optimization with a Learned Model},
  author={Zichen Liu and Siyi Li and Wee Sun Lee and Shuicheng Yan and Zhongwen Xu},
  booktitle={International Conference on Learning Representations},
  year={2023},
  url={https://arxiv.org/abs/2210.05980}
}

License

ROSMO is distributed under the terms of the Apache License 2.0.

Acknowledgement

We thank the following projects which provide great references:

Disclaimer

This is not an official Sea Limited or Garena Online Private Limited product.