Solving the Markov Decision Process (MDP) model of OpenAI Gym's FrozenLake-v0 environment using value iteration and policy iteration. This is the Week 2 homework for the #move37 course.
I chose this environment because it works well in Jupyter notebooks without any extra setup. I used a Jupyter notebook because I wanted to include notes alongside the algorithm, not just the code.
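For a quick idea of what value iteration looks like, here is a minimal sketch. It is not the notebook's code. It assumes the transition model is a nested dict where P[state][action] is a list of (probability, next_state, reward, done) tuples, which is the shape FrozenLake exposes as its P attribute in the Gym versions I know of. The name toy_P and its 3-state chain are made up for illustration.

```python
def value_iteration(P, gamma=0.99, tol=1e-8):
    """Return (V, policy) for a tabular MDP.

    P[s][a] is a list of (prob, next_state, reward, done) tuples.
    """
    V = {s: 0.0 for s in P}
    while True:
        # One-step lookahead: expected return of every action in every state.
        Q = {}
        for s in P:
            Q[s] = {}
            for a in P[s]:
                Q[s][a] = sum(
                    prob * (reward + (0.0 if done else gamma * V[s_next]))
                    for prob, s_next, reward, done in P[s][a]
                )
        # Bellman optimality backup: keep the best action value per state.
        V_new = {s: max(Q[s].values()) for s in P}
        delta = max(abs(V_new[s] - V[s]) for s in P)
        V = V_new
        if delta < tol:
            # Greedy policy with respect to the converged action values.
            policy = {s: max(Q[s], key=Q[s].get) for s in P}
            return V, policy


# Hypothetical toy MDP (not FrozenLake): action 1 moves right, action 0 moves left.
# Entering state 2 gives reward 1 and ends the episode.
toy_P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}

V, policy = value_iteration(toy_P, gamma=0.9)
print(V, policy)
```

With a real FrozenLake environment, the same function should work by passing the environment's transition table instead of toy_P, assuming it has the shape described above.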
You can see the resulting notebook at notebooks/DynamicProgramming.ipynb
To run the Jupyter server via Docker: docker-compose up