QLearning-and-Sarsa-for-Cliff-Walking

A reinforcement learning project that compares Q-learning and Sarsa on the Cliff Walking task.

Environment

The environment is Cliff Walking. Detailed information is available in [A3.pdf].

Result

The experiment shows that Sarsa tends to learn a safer path that keeps away from the cliff, while Q-learning tends to learn the optimal (shortest) path, which runs along the cliff edge.
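
The difference comes from the update target each method uses. The sketch below is a minimal illustration and is not taken from Qlearning.py or Sarsa.py: the function names, the dictionary-based Q-table, and the values of alpha and gamma are all assumptions made for the example. It shows that Sarsa bootstraps from the action actually taken next (including exploratory steps into the cliff), while Q-learning bootstraps from the best next action.

```python
from typing import Dict, List, Tuple

# Hypothetical types for illustration: a state is a (row, col) grid cell,
# and the Q-table maps each state to a list of action values.
State = Tuple[int, int]
QTable = Dict[State, List[float]]


def sarsa_update(q: QTable, s: State, a: int, r: float,
                 s_next: State, a_next: int,
                 alpha: float = 0.5, gamma: float = 1.0) -> None:
    """On-policy update: the target uses the action actually chosen next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def q_learning_update(q: QTable, s: State, a: int, r: float,
                      s_next: State,
                      alpha: float = 0.5, gamma: float = 1.0) -> None:
    """Off-policy update: the target uses the greedy (maximum) next value."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def fresh_table() -> QTable:
    # Assumed toy values: in the next state, action 0 steps into the cliff
    # (value -100) and action 1 is a safe move (value -1).
    return {(2, 0): [0.0, 0.0], (2, 1): [-100.0, -1.0]}


q_sarsa = fresh_table()
# The behaviour policy happened to explore into the cliff (a_next = 0).
sarsa_update(q_sarsa, s=(2, 0), a=1, r=-1.0, s_next=(2, 1), a_next=0)
print(q_sarsa[(2, 0)][1])  # -50.5

q_ql = fresh_table()
# Q-learning ignores the exploratory action and uses the best next value.
q_learning_update(q_ql, s=(2, 0), a=1, r=-1.0, s_next=(2, 1))
print(q_ql[(2, 0)][1])  # -1.0
```

With the same transition, the Sarsa estimate for moving next to the cliff drops sharply because the exploratory fall is counted, whereas the Q-learning estimate stays close to the greedy value. Under an exploring policy, this is what pushes Sarsa toward the safer route. Terminal-state handling is omitted here for brevity.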

How to run?

Run Qlearning.py or Sarsa.py with Python. To get a plotted figure, modify the Python file slightly.