Applying reinforcement learning algorithms (Q-learning, SARSA, deep Q-learning) to Gym environments:
In this work I applied Q-learning and SARSA to Gym's Taxi environment, and deep Q-learning to the Acrobot environment.
The results of the training are the following:
The reward in the first episode is close to -2000, but as training continues over more episodes the reward keeps increasing.
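
For reference, the difference between the two tabular methods used on Taxi comes down to the target used in the update. The following is a minimal sketch, not the code used for the runs above: the function names, the list-of-lists Q-table, and the parameters `alpha` (learning rate), `gamma` (discount factor) and `epsilon` (exploration rate) are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Return a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    # Ties are broken in favour of the lowest action index.
    return q_row.index(max(q_row))


def q_learning_update(q, s, a, r, s_next, done, alpha, gamma):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, done, alpha, gamma):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


if __name__ == "__main__":
    # Hypothetical two-state, two-action table, only to show the calls.
    rng = random.Random(0)
    q = [[0.0, 0.0], [1.0, 2.0]]
    action = epsilon_greedy(q[0], epsilon=0.1, rng=rng)
    q_learning_update(q, 0, action, -1.0, 1, False, alpha=0.5, gamma=0.9)
    print(q[0])
```

In a training loop, the table would have one row per environment state and one column per action, and one of the two update functions would be called after every environment step. Q-learning uses the maximum over next-state values regardless of what the agent does next, while SARSA uses the value of the action that the epsilon-greedy policy actually selects.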