This is an AI-controlled game in which the goal is to find the treasure. The player can move up, down, left, or right, and dies on contact with an obstacle. The user can train the AI for as long as they want, and at any point can display the best action for each state at the current iteration.
Selecting learn20 or learn50 makes the AI generate episodes and learn from them, updating the Q-table as it goes.
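The training step can be sketched as follows. This is a minimal illustration, not the game's actual code: it assumes standard tabular Q-learning with epsilon-greedy exploration, and the grid layout, rewards, hyperparameters, and every name in it (`step`, `new_q_table`, `learn`) are hypothetical. It also assumes that the number in learn20 and learn50 is an episode count, which the description above does not confirm.

```python
import random

# Moves as (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Hypothetical 4x4 layout; the real game's grid, obstacles, and rewards may differ.
ROWS, COLS = 4, 4
START, TREASURE = (0, 0), (3, 3)
OBSTACLES = {(1, 1), (2, 3)}


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    dr, dc = ACTIONS[action]
    # Moves that would leave the grid keep the player at the edge.
    r = min(max(state[0] + dr, 0), ROWS - 1)
    c = min(max(state[1] + dc, 0), COLS - 1)
    nxt = (r, c)
    if nxt in OBSTACLES:
        return nxt, -1.0, True   # the player dies, the episode ends
    if nxt == TREASURE:
        return nxt, 1.0, True    # treasure found, the episode ends
    return nxt, 0.0, False


def new_q_table():
    """One entry per (state, action) pair, all starting at zero."""
    return {(r, c): {a: 0.0 for a in ACTIONS}
            for r in range(ROWS) for c in range(COLS)}


def learn(q, episodes, alpha=0.5, gamma=0.9, epsilon=0.2,
          max_steps=100, rng=random):
    """Run `episodes` episodes, updating the Q-table `q` in place."""
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick any action at random.
                action = rng.choice(list(ACTIONS))
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state].values())
                action = rng.choice([a for a, v in q[state].items() if v == best])
            nxt, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q
```

Under these assumptions, a learn20 button would amount to calling `learn(q, 20)` on the shared table, and pressing it again continues training from the values already learned.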
Clicking policy displays, for each state, the best action found so far.
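Reading a policy out of a Q-table amounts to picking the highest-valued action in each state. The sketch below shows one way to do that and to print the result as arrows. The function names (`greedy_policy`, `render`), the arrow symbols, and the tiny hand-written Q-table are all hypothetical and exist only for illustration; the game's own display may work differently.

```python
# Hypothetical symbols for drawing each action on the grid.
ARROWS = {"up": "^", "down": "v", "left": "<", "right": ">"}


def greedy_policy(q):
    """Map each state to the action with the highest Q-value."""
    return {state: max(actions, key=actions.get) for state, actions in q.items()}


def render(policy, rows, cols, blocked=()):
    """Draw the policy as a grid of arrows, marking blocked cells with '#'."""
    lines = []
    for r in range(rows):
        cells = ["#" if (r, c) in blocked else ARROWS[policy[(r, c)]]
                 for c in range(cols)]
        lines.append(" ".join(cells))
    return "\n".join(lines)


# Hand-written Q-values for a 1x2 grid, purely for demonstration.
q = {
    (0, 0): {"up": 0.0, "down": 0.0, "left": -0.2, "right": 0.9},
    (0, 1): {"up": 0.1, "down": 0.5, "left": 0.3, "right": 0.0},
}
policy = greedy_policy(q)
print(render(policy, rows=1, cols=2))  # prints "> v"
```

Because the policy is recomputed from the table each time, it reflects whatever training has been done up to that point, and it can change after further training.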