Reward Learning by Simulating the Past

This is the code accompanying the paper "Preferences Implicit in the State of the World". Paper, blog post, poster.

To run the tests, execute python setup.py test from the repository root.

Instructions for running the experiments can be found in experiments.sh. The script experiments-for-plots.sh generates the plots from the paper.
