DPain/q-learning-practice

A practice implementation of Q-learning that finds the shortest route from A to B.

Python

Example Diagram

Circle 1 is the agent's starting position and circle 0 is the target. The arrows show which moves between circles are allowed.

Result

Answer: The shortest route is 1 -> 3 -> 7 -> 0.
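
To illustrate how a route like this can be learned, here is a minimal tabular Q-learning sketch. It is not the repository's code. The edge list in `GRAPH` is an assumption made up for illustration (the original diagram's arrows are not reproduced here), as are the names `train` and `greedy_route`, the reward of 100 for reaching the target, and the values of `alpha`, `gamma`, and `episodes`. The sketch also assumes the graph has no cycles and that every circle other than the target has at least one outgoing arrow.

```python
import random

# Hypothetical directed graph (an assumption, not the original diagram):
# GRAPH[s] lists the circles that can be reached from circle s in one move.
GRAPH = {
    0: [],        # target, no outgoing moves
    1: [2, 3],    # start
    2: [4],
    3: [5, 7],
    4: [6],
    5: [7],
    6: [0],
    7: [0],
}
START, GOAL = 1, 0


def train(graph, start, goal, episodes=1000, alpha=0.5, gamma=0.8, seed=0):
    """Learn Q-values for every (state, next_state) move by random exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, nbrs in graph.items() for a in nbrs}
    for _ in range(episodes):
        state = start
        while state != goal:
            # Explore uniformly at random; Q-learning is off-policy, so the
            # learned values still describe the greedy policy.
            action = rng.choice(graph[state])
            reward = 100.0 if action == goal else 0.0
            # Best value available from the next state (0 at the target).
            future = max((q[(action, a)] for a in graph[action]), default=0.0)
            # Standard Q-learning update toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = action
    return q


def greedy_route(graph, q, start, goal):
    """Follow the highest-valued move from each state until the goal is reached."""
    route = [start]
    while route[-1] != goal:
        state = route[-1]
        route.append(max(graph[state], key=lambda a: q[(state, a)]))
    return route


q_table = train(GRAPH, START, GOAL)
print(" -> ".join(map(str, greedy_route(GRAPH, q_table, START, GOAL))))
```

Because `gamma` is below 1, a reward reached in fewer moves is discounted less, so the greedy policy prefers the shorter route. On the assumed graph above, the move 1 -> 3 ends up with a higher Q-value than 1 -> 2, and the greedy route matches the answer given in this section.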