
Code for paper "CROP: Conservative Reward for Model-based Offline Policy Optimization".


Conservative Reward for model-based Offline Policy optimization (CROP)

This is the source code for Conservative Reward for model-based Offline Policy optimization (CROP), a model-based offline reinforcement learning method.

Installation

  1. Install MuJoCo 2.1.0

  2. Create a conda environment for CROP.

conda env create -f CROP.yml
conda activate CROP

Usage

Configuration files are located in args/. For example, to run the halfcheetah-medium task from the D4RL benchmark:

python CROP.py --args-path args/halfcheetah-medium.json
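
To illustrate how a `--args-path` flag pointing at a JSON file can drive a run, here is a minimal sketch of a config loader. It is not taken from CROP.py: the function name `load_args` and the assumption that each file in args/ holds a flat JSON object of hyperparameters are hypothetical, so check the actual script and configuration files before relying on it.

```python
import argparse
import json


def load_args(argv=None):
    """Hypothetical loader: read a JSON config named by --args-path.

    Assumes the JSON file holds a single object whose keys are
    hyperparameter names. This is an illustration only, not CROP's
    actual argument handling.
    """
    parser = argparse.ArgumentParser(description="Sketch of a JSON config loader")
    parser.add_argument(
        "--args-path",
        type=str,
        required=True,
        help="Path to a JSON configuration file, e.g. args/halfcheetah-medium.json",
    )
    cli = parser.parse_args(argv)

    with open(cli.args_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    if not isinstance(config, dict):
        raise ValueError("Expected the config file to contain a JSON object")

    # Copy the config and record which file it came from; the command-line
    # path overrides any "args_path" key stored inside the file itself.
    merged = dict(config)
    merged["args_path"] = cli.args_path
    return argparse.Namespace(**merged)
```

Under these assumptions, `load_args(["--args-path", "args/halfcheetah-medium.json"])` would return a namespace whose attributes mirror the keys in that file, which keeps per-task settings in version-controlled JSON rather than in long command lines.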