Safe Multi-Agent Reinforcement Learning to Make Decisions in Autonomous Driving
Language: Jupyter Notebook. License: Apache-2.0.
Constrained Stackelberg Q-learning and MADDPG
This is a PyTorch implementation of Constrained Stackelberg Q-learning (discrete actions) and Constrained Stackelberg MADDPG (continuous actions). Both algorithms incorporate the Stackelberg model into an existing method, Deep Q-learning and MADDPG respectively, and use the Lagrangian multiplier method to handle the safety constraints. The highway environments used in our experiments are modified from highway-env.
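To illustrate how a Stackelberg model and a Lagrangian multiplier can fit together in the discrete-action case, here is a minimal tabular sketch for a single state with one leader and one follower. It is not the repository's code: the function names, the nested-list Q-tables, the choice to constrain only the leader, and the dual-ascent step size are all assumptions made for illustration.

```python
# Hypothetical sketch; names, table layout, and update rule are illustrative
# assumptions, not taken from this repository.

def follower_best_response(q_follower, leader_action):
    """Return the follower action with the highest Q-value given the leader's action."""
    row = q_follower[leader_action]
    return max(range(len(row)), key=lambda b: row[b])


def stackelberg_actions(q_leader, q_follower, cost_leader, lam):
    """Leader maximizes its Lagrangian value (reward Q minus lam * cost Q),
    anticipating the follower's best response to each leader action."""
    best = None
    for a in range(len(q_leader)):
        b = follower_best_response(q_follower, a)
        score = q_leader[a][b] - lam * cost_leader[a][b]
        if best is None or score > best[0]:
            best = (score, a, b)
    return best[1], best[2]


def update_multiplier(lam, expected_cost, cost_limit, lr=0.1):
    """Projected dual ascent: raise lam when the expected cost exceeds the limit,
    lower it otherwise, and never let it go below zero."""
    return max(0.0, lam + lr * (expected_cost - cost_limit))


# Toy tables indexed as table[leader_action][follower_action].
q_leader = [[5.0, 1.0], [3.0, 4.0]]
q_follower = [[1.0, 2.0], [0.0, 3.0]]
cost_leader = [[0.0, 0.0], [0.0, 2.0]]

unconstrained = stackelberg_actions(q_leader, q_follower, cost_leader, lam=0.0)
penalized = stackelberg_actions(q_leader, q_follower, cost_leader, lam=2.0)
```

With `lam=0.0` the leader ignores cost and picks the action with the highest reward under the follower's best response; with a larger multiplier the costly joint action is penalized and the leader switches to the safer one. In a learned setting the tables would be replaced by neural networks and the multiplier would be updated from sampled costs.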