Reinforcement Learning with Model Predictive Control

mpcrl is a library for training model-based Reinforcement Learning (RL) agents that use Model Predictive Control (MPC) as the function approximator. This framework, also known as MPC-based RL, was first proposed in [1] and has since been shown to be effective in various applications and with different learning algorithms, e.g., [2,3].



Introduction

This framework merges two powerful control techniques into a single data-driven one:

  • MPC, a well-known control methodology that uses a prediction model to predict the future behaviour of the environment and compute the optimal action;

  • RL, a Machine Learning paradigm that has seen many successes in recent years (in games such as chess and Go) and adapts well to unknown and complex-to-model environments.

[Figure: mpcrl diagram]

The figure shows the main idea behind this learning-based control approach. The MPC controller, parametrized by $\vartheta$, acts both as the policy provider (supplying an action to the environment, given the current state) and as the function approximator for the state and action value functions. Concurrently, an RL agent tunes the parameters of the MPC so as to improve the controller's performance and achieve a (sub)optimal policy.
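As a sketch of what "MPC as function approximator" means, the formulation proposed in [1] can be written roughly as follows. The notation is assumed here for illustration and does not come from this README: $N$ is the prediction horizon, $\gamma$ the discount factor, $\lambda_\vartheta$ an initial cost term, $L_\vartheta$ the stage cost, $V^f_\vartheta$ the terminal cost, $f_\vartheta$ the prediction model and $h_\vartheta$ the constraints.

$$
\begin{aligned}
Q_\vartheta(s, a) = \min_{x, u} \quad & \lambda_\vartheta(x_0) + \sum_{k=0}^{N-1} \gamma^k L_\vartheta(x_k, u_k) + \gamma^N V^f_\vartheta(x_N) \\
\text{s.t.} \quad & x_0 = s, \quad u_0 = a, \\
& x_{k+1} = f_\vartheta(x_k, u_k), \quad h_\vartheta(x_k, u_k) \le 0, \quad k = 0, \dots, N-1, \\
V_\vartheta(s) = \min_a \; & Q_\vartheta(s, a), \qquad \pi_\vartheta(s) = \arg\min_a \; Q_\vartheta(s, a).
\end{aligned}
$$

In this reading, solving the MPC problem from state $s$ yields both the action $\pi_\vartheta(s)$ and the value $V_\vartheta(s)$, and fixing the first input to $a$ yields $Q_\vartheta(s, a)$. The RL agent then adjusts $\vartheta$ using these quantities.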


Installation

To install the package, run

pip install mpcrl
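To confirm that the installation succeeded, one option is to query the installed distribution with the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper name, and the check relies only on `importlib.metadata` (Python 3.8+), not on any mpcrl API.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version reported by pip, or None if mpcrl is not installed.
print("mpcrl version:", installed_version("mpcrl"))
```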

mpcrl has the following dependencies

For playing around with the source code instead, run

git clone https://github.com/FilippoAiraldi/mpc-reinforcement-learning.git

Examples

Our examples subdirectory contains an example application on a small linear time-invariant (LTI) system, tackled both with Q-learning and Deterministic Policy Gradient (DPG).
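To give a feel for the idea without depending on the library, here is a minimal sketch of Q-learning on a scalar LTI system. It does not use the mpcrl API and is not the code in the examples subdirectory. Everything in it is an assumption made for illustration: the one-step horizon, the quadratic costs, the constants `GAMMA`, `R` and `P`, the plant values, and all function names. The "MPC" is reduced to a one-step problem so that its minimizer has a closed form, and the model parameters `theta = (a, b)` are updated with the semi-gradient rule `theta <- theta + lr * td * grad_theta Q(s, u)`, where `td = L(s, u) + GAMMA * V(s') - Q(s, u)`.

```python
import random

GAMMA = 0.9  # discount factor (assumed)
R = 0.5      # input weight in the stage cost (assumed)
P = 1.0      # terminal cost weight (assumed)


def stage_cost(s, u):
    """Quadratic stage cost L(s, u) = s^2 + R * u^2."""
    return s * s + R * u * u


def q_value(theta, s, u):
    """One-step 'MPC' action value: the stage cost plus the discounted
    terminal cost of the state predicted by the parametrized model."""
    a, b = theta
    s_pred = a * s + b * u
    return stage_cost(s, u) + GAMMA * P * s_pred * s_pred


def policy(theta, s):
    """Minimizer of q_value over u (closed form, as Q is quadratic in u)."""
    a, b = theta
    return -GAMMA * P * a * b * s / (R + GAMMA * P * b * b)


def v_value(theta, s):
    """State value: the action value evaluated at the minimizing input."""
    return q_value(theta, s, policy(theta, s))


def q_gradient(theta, s, u):
    """Gradient of q_value with respect to theta = (a, b)."""
    a, b = theta
    common = 2.0 * GAMMA * P * (a * s + b * u)
    return (common * s, common * u)


def q_learning_step(theta, s, u, cost, s_next, lr):
    """One semi-gradient Q-learning update of the model parameters."""
    td = cost + GAMMA * v_value(theta, s_next) - q_value(theta, s, u)
    grad = q_gradient(theta, s, u)
    new_theta = tuple(t + lr * td * g for t, g in zip(theta, grad))
    return new_theta, td


def run(steps=200, lr=1e-3, seed=0):
    """Interact with the true plant and tune theta online."""
    rng = random.Random(seed)
    a_true, b_true = 0.9, 0.5  # true plant, unknown to the controller
    theta = (0.5, 0.2)         # initial (inaccurate) model parameters
    s = 1.0
    for _ in range(steps):
        u = policy(theta, s) + rng.gauss(0.0, 0.1)  # exploration noise
        s_next = a_true * s + b_true * u
        theta, _ = q_learning_step(theta, s, u, stage_cost(s, u), s_next, lr)
        # Re-excite the system once the state has practically reached zero.
        s = s_next if abs(s_next) > 1e-3 else rng.uniform(-1.0, 1.0)
    return theta


if __name__ == "__main__":
    print("tuned parameters:", run())
```

Note that the update adjusts `theta` to reduce the temporal-difference error, so the learned parameters need not match the true plant. A realistic setup would use a longer horizon, constraints and a numerical solver instead of the closed-form minimizer used here.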


License

The repository is provided under the MIT License. See the LICENSE file included with this repository.


Author

Filippo Airaldi, PhD Candidate [f.airaldi@tudelft.nl | filippoairaldi@gmail.com]

Delft Center for Systems and Control at Delft University of Technology

Copyright (c) 2023 Filippo Airaldi.

Copyright notice: Technische Universiteit Delft hereby disclaims all copyright interest in the program “mpcrl” (Reinforcement Learning with Model Predictive Control) written by the Author(s). Prof. Dr. Ir. Fred van Keulen, Dean of 3mE.


References

[1] S. Gros and M. Zanon, "Data-Driven Economic NMPC Using Reinforcement Learning," in IEEE Transactions on Automatic Control, vol. 65, no. 2, pp. 636-648, Feb. 2020, doi: 10.1109/TAC.2019.2913768.

[2] H. N. Esfahani, A. B. Kordabad and S. Gros, "Approximate Robust NMPC using Reinforcement Learning," 2021 European Control Conference (ECC), 2021, pp. 132-137, doi: 10.23919/ECC54610.2021.9655129.

[3] W. Cai, A. B. Kordabad, H. N. Esfahani, A. M. Lekkas and S. Gros, "MPC-based Reinforcement Learning for a Simplified Freight Mission of Autonomous Surface Vehicles," 2021 60th IEEE Conference on Decision and Control (CDC), 2021, pp. 2990-2995, doi: 10.1109/CDC45484.2021.9683750.