This repository implements CoVariance-Optimal MPC (CoVO-MPC), a novel sampling-based Model Predictive Control (MPC) algorithm developed in the associated paper. The paper reports that CoVO-MPC outperforms standard Model Predictive Path Integral Control (MPPI) by 43 to 54%.
Sampling-based MPC has become prevalent in motion planning and model-based RL thanks to its flexibility and parallelizability. However, it lacks a convergence analysis, so its hyperparameters are usually tuned heuristically. For instance, MPPI samples trajectories from a dynamics-agnostic isotropic Gaussian, which leads to sub-optimal performance.
We first prove MPPI's convergence and provide insights into optimal sampling covariance design. We then propose CoVO-MPC, which optimizes the sampling covariance according to the optimization landscape so that the sampling distribution is optimal for the given dynamics.
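To make the role of the sampling covariance concrete, here is a minimal, dependency-free sketch of a single MPPI-style update on a toy quadratic cost. It is not the repo's JAX implementation: the cost, the temperature, the sample count, and all function names are assumptions chosen for illustration, and the non-isotropic standard deviations are an arbitrary example, not the covariance derived in the paper.

```python
import math
import random


def quadratic_cost(u, curvature):
    """Toy cost J(u) = 0.5 * sum_i curvature[i] * u[i]**2 (stand-in for an MPC cost)."""
    return 0.5 * sum(h * x * x for h, x in zip(curvature, u))


def mppi_step(mean, std, cost_fn, temperature, num_samples, rng):
    """One MPPI-style update with a diagonal Gaussian sampling distribution.

    Returns the new mean, the sampled costs, and the normalized weights.
    """
    samples = [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]
        for _ in range(num_samples)
    ]
    costs = [cost_fn(u) for u in samples]
    # Subtract the minimum cost before exponentiating for numerical stability.
    min_cost = min(costs)
    raw = [math.exp(-(c - min_cost) / temperature) for c in costs]
    total = sum(raw)
    weights = [w / total for w in raw]
    # The new mean is the weight-averaged sample.
    new_mean = [
        sum(w * u[i] for w, u in zip(weights, samples)) for i in range(len(mean))
    ]
    return new_mean, costs, weights


CURVATURE = [100.0, 1.0]  # assumed ill-conditioned landscape


def cost_fn(u):
    return quadratic_cost(u, CURVATURE)


start = [1.0, 1.0]

# Isotropic sampling: the same standard deviation in every direction.
iso_mean, iso_costs, iso_weights = mppi_step(
    start, [0.5, 0.5], cost_fn, temperature=1.0, num_samples=1000,
    rng=random.Random(0),
)

# Non-isotropic sampling with the same determinant (0.25 * 1.0 == 0.5 * 0.5).
# These values are arbitrary; they only show that the update depends on the covariance.
aniso_mean, aniso_costs, aniso_weights = mppi_step(
    start, [0.25, 1.0], cost_fn, temperature=1.0, num_samples=1000,
    rng=random.Random(0),
)

print("start cost:        ", cost_fn(start))
print("isotropic cost:    ", cost_fn(iso_mean))
print("non-isotropic cost:", cost_fn(aniso_mean))
```

The two runs differ only in the shape of the sampling distribution, yet they produce different updated means. Choosing that shape in a principled way is the design question CoVO-MPC addresses.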
[Figure: MPPI (left) and CoVO-MPC (right)]
conda create -n jax python
conda activate jax
pip install -e .
Run CoVO-MPC in the quadrotor environment (to run other controllers, replace `covo-online` with `pid`, `mppi`, etc. You can also change the task to `tracking` or `hover`):
cd quadjax/envs
# note: --noDR disables domain randomization; use it when running a controller
python quadrotor.py --controller covo-online --task tracking_zigzag --mode render --disturb_type none --noDR
# run CoVO-MPC offline approximation version
python quadrotor.py --controller covo-offline --task tracking_zigzag --mode render --disturb_type none --noDR
This generates the tested state sequence `state_seq.pkl` and the plot `plot.png`.
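To inspect the saved state sequence outside the provided scripts, a short loader along the following lines may help. The layout of the pickled object is not documented here, so this sketch only loads it and reports its top-level type; `load_state_seq` and `summarize` are hypothetical helper names, not part of the repo.

```python
import pickle
from pathlib import Path


def load_state_seq(path):
    """Load a pickled object from `path` and return it unchanged."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def summarize(obj):
    """Describe the top-level container without assuming its inner layout."""
    if isinstance(obj, dict):
        return f"dict with keys {sorted(map(str, obj))}"
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__} of length {len(obj)}"
    return type(obj).__name__


# Example usage (path assumed; adjust to wherever the file was written):
# state_seq = load_state_seq("state_seq.pkl")
# print(summarize(state_seq))
```

Unpickling needs the classes stored in the file to be importable, so run this in the same environment used to generate the file, and only unpickle files you trust.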
Reproduce the results in the CoVO-MPC paper:
cd quadjax/scripts
# main results
sh covo_quadrotor.sh
# ablation study for sampling number
sh covo_quadrotor_N.sh
cd quadjax/scripts
python vis.py
This visualizes the results in `quadjax/results/state_seq_.pkl`, which is generated by `quadjax/envs/quadrotor.py`, using meshcat.
- All actions in the environment are normalized to the [-1, 1] range.
- For easier modification, the environment is implemented as a single file.
- The repo also supports PPO, RMA, DATT, L1 adaptive control, etc. You can find them in the `rl` branch.
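Since actions are normalized to [-1, 1], a custom controller written in physical units needs to convert its commands. The sketch below shows one common affine mapping. The function names and the example bounds are hypothetical; the environment's actual action limits and semantics should be read from `quadjax/envs/quadrotor.py`.

```python
def denormalize_action(action, low, high):
    """Map each normalized component in [-1, 1] to the range [low[i], high[i]]."""
    scaled = []
    for a, lo, hi in zip(action, low, high):
        a = max(-1.0, min(1.0, a))  # clip commands that fall outside [-1, 1]
        scaled.append(lo + 0.5 * (a + 1.0) * (hi - lo))
    return scaled


def normalize_action(physical, low, high):
    """Inverse mapping: physical units back to the [-1, 1] range."""
    return [
        2.0 * (p - lo) / (hi - lo) - 1.0
        for p, lo, hi in zip(physical, low, high)
    ]


# Hypothetical bounds for a two-component action (illustration only).
LOW = [0.0, -2.0]
HIGH = [0.8, 2.0]

physical = denormalize_action([-1.0, 0.0], LOW, HIGH)
print(physical)
print(normalize_action(physical, LOW, HIGH))
```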
If you find this repo useful for your research, please consider citing our paper:
@misc{yi2024covompc,
title={CoVO-MPC: Theoretical Analysis of Sampling-based MPC and Optimal Covariance Design},
author={Zeji Yi and Chaoyi Pan and Guanqi He and Guannan Qu and Guanya Shi},
year={2024},
eprint={2401.07369},
archivePrefix={arXiv},
primaryClass={cs.LG}
}