This repository contains the open-source code for the following paper, presented at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
Title: Integrating Model-Based Footstep Planning with Model-Free Reinforcement Learning for Dynamic Legged Locomotion
Paper Link: https://arxiv.org/abs/2408.02662
Video Link: https://youtu.be/Z0E9AKt6RFo
- Create a new Python virtual environment with Python 3.8 using Anaconda.
- Clone this repo.
- Install the humanoidGym requirements:
  ```bash
  pip install -r requirements.txt
  ```
- Install Isaac Gym:
  - Download Isaac Gym Preview 4 (Preview 3 should still work) from https://developer.nvidia.com/isaac-gym
  - Extract the zip package.
  - Copy the `isaacgym` folder and place it in a new location.
  - Install the `isaacgym/python` requirements:
    ```bash
    cd <isaacgym_location>/python
    pip install -e .
    ```
- Install humanoidGym:
  - Go back to the humanoidGym repo and install it:
    ```bash
    pip install -e .
    ```
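As a quick sanity check (this snippet is not part of the repo), you can confirm that both installs import cleanly from your environment; note that `isaacgym` must be imported before `torch`:

```python
# Minimal install check (illustrative, not part of the repo).
# Isaac Gym must be imported before torch, otherwise it raises an ImportError.
import isaacgym  # noqa: F401
import torch

print("CUDA available:", torch.cuda.is_available())
```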
All the LIP-model-related code is in the `LIPM` folder. This code is adapted from BipedalWalkingRobots for the center-of-mass (CoM) velocity tracking task.
Running the script below generates the CoM velocity-tracking demo video:

```bash
python LIPM/demo_LIPM_3D_vt.py
```

Running the analysis script below generates the demo video plus analysis plots:

```bash
python LIPM/demo_LIPM_3D_vt_analysis.py
```
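For intuition, the linear inverted pendulum (LIP) dynamics that these demos build on can be sketched in a few lines. This is an illustrative sketch with assumed parameter values and variable names, not the implementation in `LIPM/`:

```python
import numpy as np

# Illustrative 3D LIP dynamics: the CoM stays at constant height z0 and
# accelerates away from the stance foot p according to x_ddot = (g / z0) * (x - p).
g, z0, dt = 9.81, 0.6, 0.001           # gravity, CoM height, integration step (assumed values)
omega2 = g / z0

com = np.array([0.0, 0.05])            # CoM position (x, y)
com_vel = np.array([0.3, 0.0])         # CoM velocity
foot = np.array([0.0, 0.0])            # stance-foot position

for _ in range(500):                   # integrate 0.5 s of single support with explicit Euler
    com_acc = omega2 * (com - foot)
    com_vel += com_acc * dt
    com += com_vel * dt

print("CoM after 0.5 s:", com, "velocity:", com_vel)
```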
Train a policy with:

```bash
python gym/scripts/train.py --task=humanoid_controller
```
- To run on CPU, add the arguments `--sim_device=cpu` and `--rl_device=cpu` (sim on CPU and RL on GPU is also possible).
- To run headless (no rendering), add `--headless`.
- Important: to improve performance, once training starts press `v` to stop the rendering. You can enable it again later to check progress.
- The trained policy is saved in `gym/logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt`, where `<experiment_name>` and `<run_name>` are defined in the train config (a checkpoint-inspection sketch follows this list).
- The following command-line arguments override the values set in the config files:
  - `--task TASK`: Task name.
  - `--resume`: Resume training from a checkpoint.
  - `--experiment_name EXPERIMENT_NAME`: Name of the experiment to run or load.
  - `--run_name RUN_NAME`: Name of the run.
  - `--load_run LOAD_RUN`: Name of the run to load when `resume=True`. If -1, loads the last run.
  - `--checkpoint CHECKPOINT`: Saved model checkpoint number. If -1, loads the last checkpoint.
  - `--num_envs NUM_ENVS`: Number of environments to create.
  - `--seed SEED`: Random seed.
  - `--max_iterations MAX_ITERATIONS`: Maximum number of training iterations.
  - `--original_cfg`: Use the configs stored with the loaded policy instead of the current configs in `envs`.
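A saved checkpoint can be inspected from Python before loading it in the trainer. This is an illustrative snippet only: the run-folder name below is a hypothetical placeholder, and the exact dictionary keys depend on the training library:

```python
import torch

# Hypothetical path following the pattern above; substitute your own run folder.
ckpt_path = "gym/logs/humanoid_controller/Jan01_00-00-00_example/model_3000.pt"

# Load on CPU so no GPU is needed just to inspect the file.
ckpt = torch.load(ckpt_path, map_location="cpu")
print(list(ckpt.keys()))  # typically model/optimizer state dicts and the training iteration
```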
Play (visualize) a trained policy with:

```bash
python gym/scripts/play.py --task=humanoid_controller
```
- By default, the loaded policy is the last model of the last run in the experiment folder.
- Other runs/model iterations can be selected with `--load_run` and `--checkpoint` (the sketch below illustrates how "last run / last checkpoint" can be resolved on disk).
- You will need around 3,000 training iterations to obtain a well-behaved policy.
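An illustrative sketch of that default selection, using a hypothetical helper (not a function from this repo) that picks the last run folder sorted by name and its highest-numbered `model_*.pt`:

```python
import os
import re

def latest_checkpoint(experiment_dir: str) -> str:
    """Hypothetical helper: return the last run's highest-numbered model_*.pt."""
    runs = sorted(d for d in os.listdir(experiment_dir)
                  if os.path.isdir(os.path.join(experiment_dir, d)))
    last_run = os.path.join(experiment_dir, runs[-1])
    models = [f for f in os.listdir(last_run) if re.fullmatch(r"model_\d+\.pt", f)]
    models.sort(key=lambda f: int(re.findall(r"\d+", f)[0]))
    return os.path.join(last_run, models[-1])

# Example: latest_checkpoint("gym/logs/humanoid_controller")
```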
This repository does not include the code stack for deploying a policy on the MIT Humanoid hardware. Please see Cheetah-Software for our lab's hardware code stack.
To deploy a trained policy, set `EXPORT_POLICY=TRUE` in the `humanoidGym/scripts/play.py` script. This exports a `policy.onnx` file that can be run from C++ code.
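Before wiring it into C++, the exported policy can be sanity-checked from Python with onnxruntime. This is an illustrative sketch; the fallback observation size is a placeholder and must match your policy's actual input dimension:

```python
import numpy as np
import onnxruntime as ort

# Load the policy exported by play.py with EXPORT_POLICY=TRUE.
sess = ort.InferenceSession("policy.onnx")
inp = sess.get_inputs()[0]
print("input:", inp.name, inp.shape)

# Build a dummy observation; 69 is a placeholder, not the real observation dim.
obs_dim = inp.shape[-1] if isinstance(inp.shape[-1], int) else 69
obs = np.zeros((1, obs_dim), dtype=np.float32)

actions = sess.run(None, {inp.name: obs})[0]
print("action shape:", actions.shape)
```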
- If you get the error `ImportError: libpython3.8m.so.1.0: cannot open shared object file: No such file or directory`, run:
  ```bash
  export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
  ```
If you find this work useful, we would appreciate a citation in academic publications:
```bibtex
@article{lee2024integrating,
  title={Integrating Model-Based Footstep Planning with Model-Free Reinforcement Learning for Dynamic Legged Locomotion},
  author={Lee, Ho Jae and Hong, Seungwoo and Kim, Sangbae},
  journal={arXiv preprint arXiv:2408.02662},
  year={2024}
}
```