/RL

A set of RL experiments. Currently includes: (1) the MDP rank experiment, based on a policy gradient algorithm.

Primary Language: Python
