Benchmark code from the following paper:
R. Hoque, L.Y. Chen, S. Sharma, K. Dharmarajan, B. Thananjeyan, P. Abbeel, K. Goldberg. Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision. Conference on Robot Learning (CoRL), 2022.
First, install the Python dependencies in a virtual environment by running . install.sh.
To run the IFL Benchmark (IFLB), you will need to install Isaac Gym. Download Isaac Gym 1.0rc3 from https://developer.nvidia.com/isaac-gym (you may need to submit an access request, but it should be approved quickly) and follow the installation instructions in its docs to pip install it into the virtual environment. You will need NVIDIA driver version >= 470.
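Before installing Isaac Gym, it may help to confirm that the driver meets the >= 470 requirement. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and it assumes `nvidia-smi` is on the PATH and that comparing the major version number is sufficient.

```python
import shutil
import subprocess

MIN_DRIVER_MAJOR = 470  # minimum NVIDIA driver version stated above


def driver_major(version):
    """Return the major component of a driver string such as '470.82.01'."""
    return int(version.strip().split(".")[0])


def driver_is_supported(version, minimum=MIN_DRIVER_MAJOR):
    """True if the driver's major version is at least `minimum`."""
    return driver_major(version) >= minimum


def installed_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # With several GPUs, nvidia-smi prints one line per GPU; use the first.
    return result.stdout.strip().splitlines()[0]


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("Could not query the NVIDIA driver (is nvidia-smi installed?)")
    elif driver_is_supported(version):
        print(f"Driver {version} meets the >= {MIN_DRIVER_MAJOR} requirement")
    else:
        print(f"Driver {version} is older than {MIN_DRIVER_MAJOR}")
```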
Then clone NVIDIA IsaacGymEnvs from https://github.com/NVIDIA-Omniverse/IsaacGymEnvs and pip install it into the virtual environment.
We already provide demos (and expert model checkpoints). If desired, you can regenerate the task demos by running
python -m main @scripts/args_isaacgym_demos.txt --env_name [ENV_NAME]
followed by
python scripts/extract_demos.py [RAW_DATA_FILE]
where RAW_DATA_FILE is the raw_data.pkl file in the log directory generated by the experiment above.
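If you want to look inside raw_data.pkl before extracting demos, a generic inspection helper can be useful. The sketch below makes no assumption about the file's actual contents, which this README does not describe; `load_raw_data` and `describe` are hypothetical names. Unpickling may require the repository's modules to be importable, and you should only unpickle files you trust.

```python
import pickle
from pathlib import Path


def load_raw_data(path):
    """Load a pickle file such as raw_data.pkl from an experiment log directory."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def describe(obj, level=0, max_level=2):
    """Return lines summarizing the types and sizes of nested containers."""
    pad = "    " * level
    if isinstance(obj, dict):
        lines = [f"{pad}dict ({len(obj)} keys)"]
        if level < max_level:
            for key, value in obj.items():
                lines.append(f"{pad}  [{key!r}]")
                lines.extend(describe(value, level + 1, max_level))
        return lines
    if isinstance(obj, (list, tuple)):
        lines = [f"{pad}{type(obj).__name__} ({len(obj)} items)"]
        if obj and level < max_level:
            lines.append(f"{pad}  first item:")
            lines.extend(describe(obj[0], level + 1, max_level))
        return lines
    # Array-like objects (e.g. NumPy arrays) expose a `shape` attribute.
    shape = getattr(obj, "shape", None)
    suffix = f", shape={shape}" if shape is not None else ""
    return [f"{pad}{type(obj).__name__}{suffix}"]


# Example usage (replace the path with your own log directory):
#   data = load_raw_data("logs/<your_experiment>/raw_data.pkl")
#   print("\n".join(describe(data)))
```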
After moving the generated task demo file to env/assets/isaacgym/demos/task/[ENV_NAME].pkl, you can generate constraint demos by running
python -m main @scripts/args_isaacgym_constraints.txt --env_name [ENV_NAME]
followed by
python scripts/extract_constraints.py [RAW_DATA_FILE]
and moving the output pickle file to the correct location in env/assets/isaacgym/demos/constraint.
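The file placement steps above can also be scripted. This is a minimal sketch under stated assumptions: the function names are hypothetical, the script is run from the repository root, and it copies rather than moves so the original output is kept. The file name inside the constraint directory is not specified here, so the caller supplies the full destination for constraint demos.

```python
import shutil
from pathlib import Path

# Directories named in the instructions above, relative to the repository root.
TASK_DEMO_DIR = Path("env/assets/isaacgym/demos/task")
CONSTRAINT_DEMO_DIR = Path("env/assets/isaacgym/demos/constraint")


def task_demo_path(env_name, repo_root="."):
    """Destination for a task demo file: .../demos/task/[ENV_NAME].pkl."""
    return Path(repo_root) / TASK_DEMO_DIR / f"{env_name}.pkl"


def place_demo(src, dest):
    """Copy an extracted demo pickle to `dest`, creating parent directories."""
    src, dest = Path(src), Path(dest)
    if not src.is_file():
        raise FileNotFoundError(f"No demo file at {src}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)
    return dest


# Example usage ("Humanoid" and the source file name are placeholders):
#   place_demo("path/to/extracted_task_demos.pkl", task_demo_path("Humanoid"))
```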
Run the following scripts from this directory to run the IFL algorithms from the paper, or make your own scripts modeled after these.
. scripts/CUR.sh
. scripts/ensembledagger.sh
. scripts/thriftydagger.sh
. scripts/random.sh
. scripts/BC.sh
All experiment logs should be saved in logs/ (or wherever you set the logdir to be). You can organize them into subfolders, e.g., with mkdir humanoid && mv *Humanoid_* humanoid/. Then you can run python plotting/plot.py logs/humanoid [KEY], where KEY is cumulative_successes, cumulative_viols (hard resets), cumulative_idle_time, cumulative_successes cumulative_human_actions (ROHE), or another key.
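The mkdir and mv step above can be reproduced in Python if you prefer to script the log organization. This is a sketch, not part of the repository: `group_logs` is a hypothetical helper, and the run directory names in the demo are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def group_logs(logdir, pattern, subfolder):
    """Move entries in `logdir` matching `pattern` into `logdir/subfolder`.

    Equivalent in spirit to: mkdir SUBFOLDER && mv PATTERN SUBFOLDER/
    Returns the names of the entries that were moved.
    """
    logdir = Path(logdir)
    dest = logdir / subfolder
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the matches before anything is moved.
    for entry in sorted(logdir.glob(pattern)):
        if entry == dest:
            continue  # never move the destination into itself
        shutil.move(str(entry), str(dest / entry.name))
        moved.append(entry.name)
    return moved


if __name__ == "__main__":
    # Demo on a throwaway directory with made-up run names.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("run_Humanoid_0", "run_Humanoid_1", "run_Other_0"):
            (Path(tmp) / name).mkdir()
        print(group_logs(tmp, "*Humanoid_*", "humanoid"))
```

For real logs, a call such as group_logs("logs", "*Humanoid_*", "humanoid") would produce the logs/humanoid folder that the plotting command expects.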