Louiii/Markov-Decision-Processes
Policy improvement algorithm for solving Markov decision processes (MDPs). The problem is an MDP because the transition probabilities and rewards are known. If they are unknown, it becomes a reinforcement learning problem: instead of using the model directly, you would estimate action values from sampled transitions and store them in an array Q (Q-learning).
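As a rough illustration of the known-model case, the sketch below alternates policy evaluation with greedy policy improvement on a tiny MDP. It is not taken from the repository's notebook: the two-state example, the `P[s][a]` layout of `(probability, next_state, reward)` tuples, the discount `gamma=0.9`, and all function names are assumptions made for this example.

```python
# Hypothetical toy MDP (not from the repository): 2 states, 2 actions.
# P[s][a] is a list of (probability, next_state, reward) outcomes.
P = [
    [[(1.0, 0, 0.0)],                   # state 0, action 0: stay, no reward
     [(0.8, 1, 1.0), (0.2, 0, 0.0)]],   # state 0, action 1: try to move to state 1
    [[(1.0, 1, 2.0)],                   # state 1, action 0: stay, reward 2
     [(1.0, 0, 0.0)]],                  # state 1, action 1: go back to state 0
]


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate_policy(P, policy, gamma=0.9, tol=1e-10):
    """Iteratively compute the state values of a fixed deterministic policy."""
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            v = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V


def improve_policy(P, V, gamma=0.9):
    """Return the policy that is greedy with respect to V."""
    policy = []
    for s in range(len(P)):
        q = [q_value(P, V, s, a, gamma) for a in range(len(P[s]))]
        policy.append(max(range(len(q)), key=q.__getitem__))
    return policy


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(P)
    while True:
        V = evaluate_policy(P, policy, gamma)
        new_policy = improve_policy(P, V, gamma)
        if new_policy == policy:
            return policy, V
        policy = new_policy


policy, V = policy_iteration(P)
print(policy, V)
```

Because the model `P` is given, every update uses exact expectations. In the unknown-model setting described above, the `q_value` computation would have to be replaced by estimates built from sampled transitions.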
Jupyter Notebook