A Matlab implementation of the Q-learning algorithm for a known number of states. The problem solved is the apartment problem, in which the states are the nodes of the following graph.
http://mnemstudio.org/ai/path/images/map1a.gif

The agent may only move between states along the transitions available in the graph. Any unavailable transition is marked with a 0 in the action matrix and a -1 in the initial R matrix.
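
As an illustration of how the two matrices relate, the following is a minimal sketch that builds an action matrix and an initial R matrix from an edge list. The edge list, the goal state (room 5), the self-loop at the goal, the reward of 100, and the variable names `edges`, `A`, `R`, `goal` and `goalReward` are all assumptions made for this example, based on the commonly used six-room version of the graph; they are not taken from this repository's code.

```matlab
% Assumed edge list for a six-room graph (rooms labelled 0..5).
% These pairs are an assumption for illustration only.
edges = [0 4; 1 3; 1 5; 2 3; 3 4; 4 5];
goal = 5;            % assumed goal state (0-based label)
goalReward = 100;    % assumed reward for entering the goal

n = 6;
A = zeros(n);                       % action matrix: 0 = transition not available
for k = 1:size(edges, 1)
    i = edges(k, 1) + 1;            % shift 0-based labels to Matlab indices
    j = edges(k, 2) + 1;
    A(i, j) = 1;                    % each door can be used in both directions
    A(j, i) = 1;
end
A(goal + 1, goal + 1) = 1;          % assumed self-loop: the agent may stay at the goal

R = -ones(n);                       % -1 marks unavailable transitions
R(A == 1) = 0;                      % available transitions start with reward 0
R(A(:, goal + 1) == 1, goal + 1) = goalReward;  % moves into the goal are rewarded

disp(R)
```

Deriving R from the action matrix keeps the two in step: every 0 in `A` corresponds to a -1 in `R`, so an unavailable transition cannot be rewarded by accident.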
The Algorithm