- setup Mujoco
- download the repo
git clone --recurse-submodules git@github.com:cheneyuwu/rlfd
- build virtual env (python >= 3.6, < 3.8)
- cluster:
module load python/3.6
(so you have the correct version) virtualenv venv
(or use conda if preferred)
- cluster:
- enter virtual env and install packages:
- install mujoco_py (>= 2.0.0)
pip install mujoco_py
- install tensorflow (>= 2.2.0)
- local:
pip install tensorflow tensorflow_probability
(or use conda if preferred) - cluster:
pip install tensorflow_gpu tensorflow_probability
- local:
- install pytorch (>= 1.5.0)
pip install torch torchvision
(or use conda if preferred)
- install ray with tune (>= 0.8.0)
pip install ray[tune]
- install environments and rlfd
pip install gym
pip install -e gym_rlfd
pip install -e d4rl
- install rlfd
pip install -e rlfd
- add rlfd environment variables
source setup.sh
- install mujoco_py (>= 2.0.0)
Example launch files will be provided when we settle down our methods. For now, carefully go through rlfd/launch.py
and see how a launch file is parsed.
A launch file should look like this:
# file name: <launch file>.py
from copy import deepcopy
from rlfd.params.sac import gym_mujoco_params # import default parameters for gym_mujoco environments
# the launch file defines a global dict called params_config
params_config = deepcopy(gym_mujoco_params) # get default parameters of an algorithm.
params_config["config"] = ("SAC", ) # provide your exp config name, will be used for plotting.
params_config["env_name"] = "halfcheetah-medium-v0" # which environment
# Make whatever changes to the default parameters.
params_config["seed"] = tuple(range(5)) # a tuple means grid search.
Train locally:
python -m rlfd.launch --targets train:<param name>.py --num_cpus <default to 1> --num_gpus <default to 0>
Train on CC:
python -m rlfd.launch --targets slurm:<param name>.py --num_cpus <default to 1> --num_gpus <default to 0> --memory <per cpu, default to 4GB>
After training, the following files should present in the each experiment directory:
params.json - parameters for the experiment enclosed in this directory
policies - a folder containing intermediate policies and the last one after online/offline training.
summaries - tensorboard summaries
python -m rlfd.launch --targets evaluate --policy <policy file name>.pkl
Use tensorboard
tensorboard --logdir <path to experiment directory> --port <port number>
Use our own plotting scripts, check rlfd/plot.py
for details.
python -m rlfd.launch --targets plot --exp_dir <top level plotting directory>