
Official Code Repository for the POLICEd-RL Paper: https://arxiv.org/abs/2403.13297


POLICEd-RL: Learning Closed-Loop Robot Control Policies with Provable Satisfaction of Hard Constraints

License: MIT

Overview

This repository contains the code implementing POLICEd RL, presented at RSS 2024. The objective of POLICEd RL is to guarantee the satisfaction of an affine hard constraint when learning a policy in closed loop with a black-box deterministic environment. The algorithm enforces a repulsive buffer in front of the constraint, preventing trajectories from approaching and violating it. To analytically verify constraint satisfaction, the policy is made affine inside this repulsive buffer using the POLICE algorithm.
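To give a concrete feel for how a policy can be made affine on a buffer, here is a minimal PyTorch sketch loosely following the POLICE bias-shifting idea; the class and function names are illustrative and do not match this repository's API:

```python
import torch
import torch.nn as nn

class POLICEdMLP(nn.Module):
    """ReLU MLP made affine on a polytopic buffer region (illustrative sketch).

    POLICE idea: at every hidden layer, shift the biases so that all vertices
    of the buffer share one ReLU activation pattern. Since an activation
    region is convex, every point in the buffer (the convex hull of the
    vertices) then follows the same affine map.
    """

    def __init__(self, sizes):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])
        )

    def forward(self, x, vertices):
        # x: (batch, dim) states; vertices: (V, dim) buffer polytope vertices
        v = vertices
        for layer in self.layers[:-1]:
            pre_x = layer(x)
            pre_v = layer(v)
            # Majority activation sign of each neuron over the buffer vertices.
            sign = torch.sign(pre_v.mean(dim=0)).detach()
            sign = torch.where(sign == 0, torch.ones_like(sign), sign)
            # Smallest bias shift pushing every vertex to the majority side.
            shift = sign * torch.relu(-sign * pre_v).amax(dim=0)
            x = torch.relu(pre_x + shift)
            v = torch.relu(pre_v + shift)
        return self.layers[-1](x)

# Example: a 2D policy that is affine on the unit-square buffer.
policy = POLICEdMLP([2, 64, 64, 2])
buffer_vertices = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
actions = policy(torch.rand(8, 2), buffer_vertices)
```

The shift is the smallest bias correction that forces every buffer vertex onto the majority side of each ReLU, which is what makes the network exactly affine, rather than merely approximately linear, on the buffer.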

POLICEd RL guarantees that this KUKA robotic arm will never cross the red surface when reaching for the green target, thanks to the cyan repulsive buffer.


We provide the code for our implementation of POLICEd RL on several systems:

  • an illustrative 2D system
  • the CartPole
  • the Gymnasium Inverted Pendulum
  • a KUKA robotic arm

We illustrate POLICEd RL on a 2D system tasked with reaching a target location (cyan) without crossing a constraint line (red). In the repulsive buffer (green), the policy is affine and learns to point away from the constraint; a toy version of this repulsion check is sketched after the figure below.

POLICEd RL learns to reach the target while avoiding the constraint
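The guarantee rests on a repulsion condition: inside the buffer, the closed-loop dynamics must push the state away from the constraint. Below is a toy NumPy sketch of that check; the constraint, dynamics, and policy are illustrative stand-ins, not this repository's code:

```python
import numpy as np

# Illustrative 2D setup: the hard constraint is the affine inequality
# C @ s <= d, here the horizontal line x2 = 1.
C = np.array([0.0, 1.0])
d = 1.0
eps = 0.1  # repulsion margin the buffer must guarantee

def dynamics(s, a):
    """Toy single-integrator stand-in for the black-box environment."""
    return a

def policy(s):
    """Stand-in affine policy; in the buffer it must push the state
    away from the constraint line."""
    return np.array([1.0, -0.5])

# Verification idea: check C @ s_dot <= -eps over the buffer. With an
# affine policy and affine constraint, checking finitely many buffer
# vertices suffices; here we simply sample the buffer densely.
buffer_states = [np.array([x, y])
                 for x in np.linspace(0.0, 1.0, 5)
                 for y in np.linspace(0.9, 1.0, 3)]
violations = [s for s in buffer_states
              if C @ dynamics(s, policy(s)) > -eps]
print(f"repulsion condition violated at {len(violations)} of "
      f"{len(buffer_states)} sampled buffer states")
```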

Organization

  • POLICEdRL contains the project source code,
  • docs contains the code for our website.

Credit

The following repositories have been instrumental, from both an algorithmic and a software-architecture perspective, in the development of this project: