/MDP

Primary LanguagePython

MDP (MARKOV DECISION PROCESS)

Problem Statement

Consider the 3x3 world shown in the following figure:

image
The agent has four actions Up, Down, Right and Left. The transition model is: 80% of the time the agent goes in the direction it selects; the rest of the time it moves at right angles to the intended direction. A collision with a wall results in no movement.

image