CoVO-MPC: CoVariance-Optimal MPC

[Website] [PDF] [Arxiv]

Simulator	Hardware (Crazyflie)

This repository implements CoVariance-Optimal MPC (CoVO-MPC), a sampling-based Model Predictive Control (MPC) algorithm introduced in the associated paper. The paper reports that CoVO-MPC outperforms standard Model Predictive Path Integral control (MPPI) by 43 to 54%.

Why CoVO-MPC?

Sampling-based MPC has become prevalent in motion planning and model-based RL because of its flexibility and parallelizability. However, it has lacked a convergence analysis, so its hyperparameters are usually tuned heuristically. For instance, MPPI samples trajectories from a dynamics-agnostic isotropic Gaussian, which leads to sub-optimal performance.

We first prove the convergence of MPPI and provide insights into the design of the optimal sampling covariance. CoVO-MPC then optimizes the sampling covariance according to the optimization landscape, so that the sampling distribution is optimal for the given dynamics.
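To make the role of the sampling covariance concrete, here is a minimal NumPy sketch of one sampling-based MPC update. It is not the repository's implementation and it does not compute the CoVO-MPC covariance. The function name `sampling_mpc_step`, the toy quadratic cost, and the temperature `lam` are assumptions made for illustration. Passing an identity covariance gives the isotropic sampling described above; any other positive semi-definite matrix can be passed in its place.

```python
import numpy as np

def sampling_mpc_step(mean, cov, cost_fn, n_samples=1024, lam=0.1, seed=0):
    """One softmax-weighted update of the control-sequence mean.

    mean: (d,) flattened control sequence; cov: (d, d) sampling covariance.
    """
    rng = np.random.default_rng(seed)
    # Draw candidate control sequences around the current mean.
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    costs = np.array([cost_fn(u) for u in samples])
    # Exponentiated-cost weights; subtract the minimum for numerical stability.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    # The new mean is the weighted average of the samples.
    return weights @ samples

# Toy quadratic cost with very different curvature per dimension (an assumption).
hessian = np.diag([100.0, 1.0])
target = np.array([1.0, -1.0])

def cost(u):
    diff = u - target
    return 0.5 * diff @ hessian @ diff

mean0 = np.zeros(2)
# Isotropic sampling covariance.
iso_mean = sampling_mpc_step(mean0, np.eye(2), cost)
# Anisotropic covariance, chosen by hand for illustration (not the CoVO-MPC design).
shaped_mean = sampling_mpc_step(mean0, np.diag([0.1, 1.0]), cost)
```

The only thing that changes between the two calls is `cov`, which is the quantity CoVO-MPC is designed to choose in a principled way.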

MPPI	CoVO-MPC

Installation

conda create -n jax python
conda activate jax
pip install -e .

Run `CoVO-MPC`

Run CoVO-MPC in the quadrotor environment. To run another controller, replace covo-online with pid, mppi, etc. You can also change the task to tracking or hover:

cd quadjax/envs
# note: --noDR turns off domain randomization; keep it off when running a controller
python quadrotor.py --controller covo-online --task tracking_zigzag --mode render --disturb_type none --noDR 
# run CoVO-MPC offline approximation version
python quadrotor.py --controller covo-offline --task tracking_zigzag --mode render --disturb_type none --noDR

This generates the recorded state sequence state_seq.pkl and the plot plot.png.
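To compare several controllers on the same task, the command above can be assembled programmatically. The sketch below only builds and prints the command lines, using the flags shown above. The helper name `build_command` is hypothetical, and whether every controller accepts every task is an assumption to check against quadrotor.py.

```python
import shlex

def build_command(controller, task="tracking_zigzag", mode="render",
                  disturb_type="none", no_dr=True):
    """Return the quadrotor.py invocation as an argument list."""
    cmd = ["python", "quadrotor.py",
           "--controller", controller,
           "--task", task,
           "--mode", mode,
           "--disturb_type", disturb_type]
    if no_dr:
        cmd.append("--noDR")  # turn off domain randomization
    return cmd

# Controller names taken from the text above.
for name in ["covo-online", "covo-offline", "mppi", "pid"]:
    print(shlex.join(build_command(name)))
```

Each printed line can be run from quadjax/envs, or the argument list can be passed to `subprocess.run(cmd, check=True)`.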

Reproduce the results in the CoVO-MPC paper:

cd quadjax/scripts
# main results
sh covo_quadrotor.sh
# ablation study on the number of samples
sh covo_quadrotor_N.sh

Visualization

cd quadjax/scripts
python vis.py

This visualizes the results in quadjax/results/state_seq_.pkl with meshcat. That file is generated by quadjax/envs/quadrotor.py.
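Before launching the visualizer, it can help to check what the saved file contains. The sketch below loads a pickle and reports its top-level type and length. The internal layout of the state sequence is not documented here, so the helper (`describe_pickle`, a hypothetical name) makes no assumption about it, and the demo data is a stand-in rather than the real format. Unpickling may require the project's packages to be importable, and pickle files should only be loaded from trusted sources.

```python
import pickle
import tempfile
from pathlib import Path

def describe_pickle(path):
    """Load a pickle file and return (type name, length or None)."""
    with Path(path).open("rb") as f:
        obj = pickle.load(f)
    length = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, length

# Self-contained demo with stand-in data; point describe_pickle at the real file instead.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo_state_seq.pkl"
    with demo_path.open("wb") as f:
        pickle.dump([{"step": 0}, {"step": 1}], f)
    summary = describe_pickle(demo_path)

print(summary)
```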

Notes

All actions in the environment are normalized to the [-1, 1] range.
For easier modification, the environment is implemented as a single file.
The repo also supports PPO, RMA, DATT, L1 adaptive control, etc. You can find them in the rl branch.
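Because actions are normalized to [-1, 1], a controller that reasons in physical units needs an affine map in each direction. A minimal sketch follows. The bounds used (a thrust range of 0 to 0.8 N) are placeholder values, not the environment's, and the function names are hypothetical.

```python
def normalize(value, low, high):
    """Map a physical value in [low, high] to [-1, 1], clipping out-of-range input."""
    scaled = 2.0 * (value - low) / (high - low) - 1.0
    return max(-1.0, min(1.0, scaled))

def denormalize(action, low, high):
    """Map a normalized action in [-1, 1] back to [low, high]."""
    clipped = max(-1.0, min(1.0, action))
    return low + 0.5 * (clipped + 1.0) * (high - low)

# Placeholder bounds for illustration only.
THRUST_MIN, THRUST_MAX = 0.0, 0.8
mid_thrust_action = normalize(0.4, THRUST_MIN, THRUST_MAX)
recovered_thrust = denormalize(mid_thrust_action, THRUST_MIN, THRUST_MAX)
```

Clipping on both sides keeps an out-of-range command from leaving the normalized interval.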

Citation

If you find this repo useful for your research, please consider citing our paper:

@misc{yi2024covompc,
      title={CoVO-MPC: Theoretical Analysis of Sampling-based MPC and Optimal Covariance Design}, 
      author={Zeji Yi and Chaoyi Pan and Guanqi He and Guannan Qu and Guanya Shi},
      year={2024},
      eprint={2401.07369},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
