Reinforcement Learning (RL) code for the paper "OPT-Mimic: Imitation of Optimized Trajectories for Dynamic Quadruped Behaviors". This repository was modified from the raisimLib repository. Modifications include the addition of a URDF file for the ODRI Solo 8 robot to `rsc/solo8_v7/`, and the addition of RL environment and training code to `raisimGymTorch/raisimGymTorch/env/envs/solo8_env/`.
The installation steps from Raisim should be followed first. Then this repo can be cloned into your `$WORKSPACE` directory.
- After any changes to C++ code, go to `raisimGymTorch/` and run `python setup.py develop`.
- Go to `raisimGymTorch/raisimGymTorch/env/envs/solo8_env/` and run `python vec_ppo.py -n my-experiment-name` to start RL training.
- Files indicating training progress will be saved to `raisimLibSolo/raisimGymTorch/raisimGymTorch/env/envs/solo8_env/stats/my-experiment-name/`.
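Starting a training run can be sketched as a shell session (paths taken from this README; `my-experiment-name` is a placeholder, and the directory check just makes the sketch safe to paste from anywhere):

```shell
# Launch PPO training from the environment directory (run from the repo root).
ENV_DIR=raisimGymTorch/raisimGymTorch/env/envs/solo8_env
if [ -d "$ENV_DIR" ]; then
    cd "$ENV_DIR"
    python vec_ppo.py -n my-experiment-name
    # Training-progress files then appear under stats/my-experiment-name/
else
    echo "expected $ENV_DIR under the repository root" >&2
fi
```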
- The reference motion to track can be specified in the `ref_filename` argument of `raisimLibSolo/raisimGymTorch/raisimGymTorch/env/envs/solo8_env/cfg.yaml`. This should correspond to a filename in `raisimGymTorch/raisimGymTorch/env/envs/solo8_env/traj/`, which contains the reference-motion CSV files produced by trajectory optimization.
- Note that Raisim comes bundled with RL training code built on OpenAI Stable Baselines, but it is unused; this repository uses a custom implementation of the RL training code instead.
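For example, the relevant `cfg.yaml` entry might look like this (a sketch: the `environment` key nesting is inferred from `cfg['environment']` usage elsewhere in this README, and the filename is hypothetical):

```yaml
# cfg.yaml (fragment) -- exact nesting is an assumption
environment:
  ref_filename: my_trajectory.csv   # hypothetical CSV file in traj/
```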
- Go to `raisimGymTorch/raisimGymTorch/env/envs/solo8_env/` and run `python test_policy.py my-experiment-name/latest.pt` to run the latest policy trained using `python vec_ppo.py -n my-experiment-name`.
- Early termination, which is important during RL training, can be turned off during testing by setting `cfg['environment']['disable_termination']` to true in `raisimGymTorch/raisimGymTorch/env/envs/solo8_env/test_policy.py` (line 40 as of writing).
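The change amounts to flipping one flag on the loaded config dict (a minimal sketch; in `test_policy.py` the `cfg` dict is assumed to be loaded from `cfg.yaml`, which is not shown here):

```python
# Sketch: stand-in for the cfg dict that test_policy.py loads from cfg.yaml.
cfg = {'environment': {'disable_termination': False}}

# Turn off early termination so the policy runs to completion during testing.
cfg['environment']['disable_termination'] = True
```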