This is the source code for Conservative Reward for model-based Offline Policy optimization (CROP), a model-based offline reinforcement learning method.
- Install MuJoCo 2.1.0.
- Create a conda environment for CROP.
conda env create -f CROP.yml
conda activate CROP
Configuration files can be found in args/. For example, to run the halfcheetah-medium task from the D4RL benchmark, use the following command.
python CROP.py --args-path args/halfcheetah-medium.json
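
To see which tasks are available before launching a run, something like the following sketch may help. It lists the JSON files in args/ and prints the matching command for each one, along with the top-level setting names. The helper names `list_task_configs` and `build_command` are hypothetical and not part of this repository, and the script assumes each config file is a JSON object.

```python
import json
from pathlib import Path


def list_task_configs(args_dir="args"):
    """Return the sorted paths of all JSON config files in args_dir (hypothetical helper)."""
    return sorted(Path(args_dir).glob("*.json"))


def build_command(config_path):
    """Build the CROP.py command line for one config file (hypothetical helper)."""
    return ["python", "CROP.py", "--args-path", Path(config_path).as_posix()]


if __name__ == "__main__":
    for config in list_task_configs():
        with open(config) as f:
            settings = json.load(f)
        # Assumption: each config is a JSON object; otherwise no keys are shown.
        keys = sorted(settings) if isinstance(settings, dict) else []
        print(" ".join(build_command(config)))
        print("  settings:", ", ".join(keys))
```

Run it from the repository root so that the relative args/ path resolves. It only prints commands and does not start any training.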