population-irl

(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards

Primary language: Python. License: MIT.
