The goal of this project is to create a simple environment for applying Dynamic Programming (DP) methods to the OpenAI Gym Frozen Lake environment.
DP methods require full knowledge of the MDP, so they cannot be applied directly to the Frozen Lake environment.
The custom_frozen_lake environment allows the agent to be placed in any state and, by estimating rewards and transition probabilities, makes it possible to apply DP.
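As a rough illustration of how rewards and transition probabilities can be estimated when the agent can be placed in any state, here is a minimal sketch. The `ToyLake` class, its `set_state` and `step` methods, and the `estimate_model` function are hypothetical names invented for this example; they are not the actual custom_frozen_lake API.

```python
import random
from collections import defaultdict


class ToyLake:
    """Hypothetical 4-state corridor standing in for an env whose state can be set."""

    n_states = 4
    n_actions = 2  # 0 = left, 1 = right

    def __init__(self, slip=0.2, seed=0):
        self.slip = slip
        self.rng = random.Random(seed)
        self.state = 0

    def set_state(self, state):
        # Place the agent directly in the requested state.
        self.state = state

    def step(self, action):
        # With probability `slip` the agent moves in the opposite direction.
        if self.rng.random() < self.slip:
            action = 1 - action
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        # Reward 1 whenever the step ends in the last state (the "goal").
        reward = 1.0 if self.state == self.n_states - 1 else 0.0
        return self.state, reward


def estimate_model(env, n_samples=2000):
    """Estimate P(s' | s, a) and the expected reward R(s, a) by repeated sampling."""
    P, R = {}, {}
    for s in range(env.n_states):
        for a in range(env.n_actions):
            counts = defaultdict(int)
            total_reward = 0.0
            for _ in range(n_samples):
                env.set_state(s)  # reset to the state being probed
                s_next, r = env.step(a)
                counts[s_next] += 1
                total_reward += r
            # Relative frequencies approximate the transition probabilities.
            P[(s, a)] = {s2: c / n_samples for s2, c in counts.items()}
            R[(s, a)] = total_reward / n_samples
    return P, R


P, R = estimate_model(ToyLake())
```

With enough samples per state-action pair, the estimated probabilities approach the true ones (here roughly 0.8 for the intended move and 0.2 for the slip), and the resulting tables can be passed to a DP method.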
The Value Iteration and Policy Iteration methods were implemented based on the slides from David Silver's Reinforcement Learning course at UCL.
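For readers unfamiliar with Value Iteration, the following is a minimal sketch using synchronous Bellman optimality backups on a tiny, fully known MDP. The 3-state chain, the `P` table layout, and the function names are assumptions made for illustration; this is not the implementation shipped in this project.

```python
def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[(s, a)])


def value_iteration(P, n_states, n_actions, gamma=0.9, theta=1e-8):
    """Synchronous value iteration; returns state values and a greedy policy."""
    V = [0.0] * n_states
    while True:
        # Back up every state from the previous sweep's values.
        V_new = [
            max(q_value(P, V, s, a, gamma) for a in range(n_actions))
            for s in range(n_states)
        ]
        delta = max(abs(new - old) for new, old in zip(V_new, V))
        V = V_new
        if delta < theta:
            break
    # Greedy policy with respect to the converged values.
    policy = [
        max(range(n_actions), key=lambda a: q_value(P, V, s, a, gamma))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical deterministic chain: states 0, 1, 2; state 2 is absorbing.
# Actions: 0 = left, 1 = right. Entries are (probability, next_state, reward).
P = {
    (0, 0): [(1.0, 0, 0.0)],
    (0, 1): [(1.0, 1, 0.0)],
    (1, 0): [(1.0, 0, 0.0)],
    (1, 1): [(1.0, 2, 1.0)],  # reaching the goal pays reward 1
    (2, 0): [(1.0, 2, 0.0)],
    (2, 1): [(1.0, 2, 0.0)],
}

V, policy = value_iteration(P, n_states=3, n_actions=2)
```

On this toy chain the values converge to 0.9, 1.0 and 0.0 for states 0, 1 and 2, and the greedy policy moves right in both non-terminal states.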
You can run the examples by executing
python run_example.py
in the dynamic_programming directory.
Most objects have comments describing their usage, so feel free to check them.
Have fun!