
The "final" and "new" versions of the "maze solver" implement model-based Bayesian reinforcement learning. You can use the matrixPlot.m file to plot the agent's trajectory. First, the maze is formulated as an MDP; the objective is to reach the final element of the maze while picking up the flags along the trajectory. I used a Dirichlet distribution as the conjugate prior and performed belief monitoring with Bayes' rule. I also parameterized the transition dynamics for inference and for updating the posterior.
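To illustrate the belief-monitoring idea, here is a minimal sketch of a Dirichlet posterior update over transition probabilities. It is written in Python for brevity and is not taken from the repository's MATLAB code: the class name `DirichletTransitionBelief`, its methods, the integer state and action indices, and the uniform prior count are all assumptions made for this example. Because the Dirichlet is conjugate to the categorical likelihood, applying Bayes' rule to an observed transition (s, a, s') amounts to adding one to the matching count.

```python
import random


class DirichletTransitionBelief:
    """Hypothetical helper: one Dirichlet belief per (state, action) pair
    over the next-state distribution. Not the repository's implementation."""

    def __init__(self, n_states, n_actions, prior_count=1.0):
        # alpha[s][a][s_next] holds the Dirichlet concentration parameters.
        # A uniform prior_count is an assumption made for this sketch.
        self.alpha = [
            [[prior_count] * n_states for _ in range(n_actions)]
            for _ in range(n_states)
        ]

    def update(self, s, a, s_next):
        # Conjugate Bayes update: observing (s, a, s_next) adds one count.
        self.alpha[s][a][s_next] += 1.0

    def posterior_mean(self, s, a):
        # Expected transition probabilities under the current belief.
        counts = self.alpha[s][a]
        total = sum(counts)
        return [c / total for c in counts]

    def sample(self, s, a, rng=random):
        # Draw one transition distribution from the Dirichlet posterior
        # by normalizing independent Gamma(alpha_i, 1) draws.
        draws = [rng.gammavariate(c, 1.0) for c in self.alpha[s][a]]
        total = sum(draws)
        return [d / total for d in draws]


# Usage: three observed transitions from state 0 under action 0.
belief = DirichletTransitionBelief(n_states=3, n_actions=2)
for s_next in (1, 1, 2):
    belief.update(0, 0, s_next)
mean = belief.posterior_mean(0, 0)
```

With a prior count of 1 for each of the three next states, the counts for (0, 0) become 1, 3, and 2, so the posterior mean puts probability 1/2 on state 1. Pairs that have not been visited keep the uniform prior.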