
An implementation of the Q-learning algorithm, used to find the exit of a house. The agent is placed in a room and uses the algorithm to find its way out.

Rooms are modeled as nodes, and the room leading to the exit has a higher reward score than the others. This reward score is propagated back to the initial room, which yields an escape route.
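
The idea above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the room layout in `DOORS`, the reward of 100, the discount `GAMMA`, and the function names `train` and `escape_route` are all assumptions made for the example. It uses the simplified deterministic update `Q(room, next) = reward + GAMMA * max(Q(next, ...))`, which is one common way to write Q-learning for this kind of problem.

```python
import random

# Hypothetical layout (an assumption, not taken from this project):
# rooms 0-4 are inside the house, node 5 is the exit.
# Each key maps a room to the rooms reachable through a door.
DOORS = {
    0: [4],
    1: [3, 5],
    2: [3],
    3: [1, 2, 4],
    4: [0, 3, 5],
    5: [],  # the exit is terminal: nothing to do once outside
}
EXIT = 5
GAMMA = 0.8  # assumed discount factor


def reward(next_room):
    """Only a move that reaches the exit is rewarded."""
    return 100.0 if next_room == EXIT else 0.0


def train(episodes=500, seed=0):
    """Learn Q-values by walking randomly through the house."""
    rng = random.Random(seed)
    q = {room: {nxt: 0.0 for nxt in nbrs} for room, nbrs in DOORS.items()}
    for _ in range(episodes):
        room = rng.choice(list(DOORS))
        while room != EXIT:
            nxt = rng.choice(DOORS[room])
            # Simplified update (learning rate 1, deterministic moves).
            best_next = max(q[nxt].values(), default=0.0)
            q[room][nxt] = reward(nxt) + GAMMA * best_next
            room = nxt
    return q


def escape_route(q, start):
    """Follow the highest-scoring door from each room until the exit."""
    route = [start]
    # The length guard stops the walk if the Q-values are still untrained.
    while route[-1] != EXIT and len(route) <= len(q):
        options = q[route[-1]]
        route.append(max(options, key=options.get))
    return route
```

With this assumed layout, `escape_route(train(), 0)` walks from room 0 through room 4 to the exit. The score of 100 on the doors into the exit shrinks by a factor of `GAMMA` for every room farther away, which is what lets the agent trace a route back from any starting room.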
