phalonneNana/Reinforcement-Learning-by-Minimizing-Constraint-Violation
This research addresses the challenge of representing logical safety constraints within the reinforcement learning (RL) framework. It introduces a novel violation measure that improves the agent's decision-making in model-free RL, using constrained Markov Decision Processes so that the agent adheres to the safety constraints.
Jupyter Notebook