Setup

Install mujoco
Install mujoco-py
Clone rstrudel/bcmuj and run pip install -e . inside the cloned directory
Clone rstrudel/bc and run pip install -e . inside the cloned directory

Commands

python online_train.py [--method METHOD] [--resume RESUME]

METHOD must be one of bc, dagger or dart
RESUME is the identifier of the epoch to resume training from (saved models are stored in ./storage/models/[METHOD]/)

python eval.py [METHOD] [EPOCH] [--render] [--eps EPS] [--all ALL]

METHOD must be one of %expert, bc, dagger or dart
EPOCH is the identifier of the epoch to evaluate (ignored for %expert)
--render renders the environment during evaluation
--eps EPS sets the number of episodes to run (default is 1000)
--all ALL evaluates every epoch at interval ALL up to EPOCH, instead of EPOCH alone (not compatible with --render)
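
The options above can be illustrated with a small argparse sketch. This is not the actual eval.py; it is a hypothetical reconstruction from the usage line. The integer type for EPOCH, the helper names parse_args and epochs_to_evaluate, and the assumption that --all starts at the first multiple of the interval are all guesses made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse a command line shaped like the eval.py usage line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="eval.py")
    parser.add_argument("method", choices=["%expert", "bc", "dagger", "dart"])
    # EPOCH is optional here because it is irrelevant for the expert policy.
    # Treating it as an integer is an assumption.
    parser.add_argument("epoch", nargs="?", type=int, default=None)
    parser.add_argument("--render", action="store_true",
                        help="render the environment")
    parser.add_argument("--eps", type=int, default=1000,
                        help="number of episodes to run")
    parser.add_argument("--all", type=int, default=None,
                        help="evaluate every epoch at this interval up to EPOCH")
    args = parser.parse_args(argv)
    # The usage notes state that --all and --render cannot be combined.
    if args.all is not None and args.render:
        parser.error("--all is not compatible with --render")
    return args


def epochs_to_evaluate(epoch, interval=None):
    """List the epochs an evaluation would cover.

    Assumption: with an interval, evaluation starts at the first multiple
    of the interval and ends at `epoch` inclusive.
    """
    if interval is None:
        return [epoch]
    return list(range(interval, epoch + 1, interval))
```

Under these assumptions, `python eval.py dagger 6144 --eps 500 --all 128` would cover 48 checkpoints (128, 256, up to 6144), each evaluated over 500 episodes.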

Results

Our results can be found in this notebook.

To reproduce, run:

python online_train.py --method bc ;
python online_train.py --method dagger ;
python online_train.py --method dart ;
python eval.py bc 6144 --eps 500 --all 128 ;
python eval.py dagger 6144 --eps 500 --all 128 ;
python eval.py dart 6144 --eps 500 --all 128

Then execute the notebook.
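
The six commands above could also be driven from a short Python script, which may be convenient for a run of this length. The sketch below only assembles and launches the same command lines. The names build_commands and run_all are hypothetical, and check=False is chosen to mirror the `;` separators, which continue to the next command even if one fails.

```python
import subprocess
import sys

METHODS = ["bc", "dagger", "dart"]


def build_commands(last_epoch=6144, episodes=500, interval=128):
    """Return the training commands followed by the evaluation commands."""
    train = [
        [sys.executable, "online_train.py", "--method", method]
        for method in METHODS
    ]
    evaluate = [
        [sys.executable, "eval.py", method, str(last_epoch),
         "--eps", str(episodes), "--all", str(interval)]
        for method in METHODS
    ]
    return train + evaluate


def run_all():
    """Run every command in order, continuing even if one of them fails."""
    for command in build_commands():
        print("Running:", " ".join(command))
        subprocess.run(command, check=False)
```

Calling `run_all()` from the repository root starts the full pipeline. Switching to `check=True` would stop at the first failing command instead.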

Running this requires a GPU with a large amount of memory. Training and evaluation together should take about 48 hours.

lcswillems/MVA-RecVis-project
