HumanCompatibleAI/population-irl
(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
Python · MIT