Conservative Reward for model-based Offline Policy optimization (CROP)

This is the source code for Conservative Reward for model-based Offline Policy optimization (CROP), a model-based offline reinforcement learning method.

Installation

Install MuJoCo 2.1.0
Create a conda environment for CROP.

conda env create -f CROP.yml
conda activate CROP

Usage

Configuration files can be found in args/. For example, to run the halfcheetah-medium task from the D4RL benchmark, run:

python CROP.py --args-path args/halfcheetah-medium.json
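
To illustrate how a JSON file passed through --args-path could be turned into run settings, here is a minimal sketch. It is not taken from CROP.py: the function names load_args and demo and the config keys "task" and "seed" are assumptions made for illustration, and the real files in args/ may use different keys.

```python
import argparse
import json
import tempfile
from pathlib import Path


def load_args(argv=None):
    """Parse --args-path and return the JSON file's contents as a dict."""
    parser = argparse.ArgumentParser(description="Hypothetical config loader")
    parser.add_argument("--args-path", type=Path, required=True,
                        help="Path to a JSON configuration file")
    cli = parser.parse_args(argv)
    with cli.args_path.open() as f:
        return json.load(f)


def demo():
    """Write a stand-in config to a temp file and load it back."""
    # The keys below are invented for this example, not CROP's actual schema.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.json"
        path.write_text(json.dumps({"task": "halfcheetah-medium", "seed": 0}))
        return load_args(["--args-path", str(path)])
```

Keeping every setting in one JSON file per task means a run can be reproduced by pointing the same command at the same file.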
