reinforcement-learning-models: A Jupyter Notebook repository from Phrungck

Reinforcement Learning Algorithms (2021 coding-style)

Implementations of Q-Learning, SARSA, and the Cross-Entropy Method using numpy and torch, with a performance comparison on three environments: deterministic FrozenLake, stochastic FrozenLake, and CliffWalking.
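For orientation, here is a minimal sketch of the core step of each of the three methods, written with numpy only. It is not taken from the notebooks: the function names, the default values for `alpha`, `gamma`, and `percentile`, and the tabular layout of `Q` (one row per state, one column per action) are all assumptions made for illustration.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, done=False):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r if done else r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

def select_elite(returns, percentile=70):
    """Cross-entropy-style elite filter (hypothetical helper).

    Returns the indices of episodes whose return is at or above the given
    percentile, plus the threshold itself. The policy would then be refit
    on the state-action pairs of those episodes only.
    """
    returns = np.asarray(returns, dtype=float)
    threshold = np.percentile(returns, percentile)
    return np.flatnonzero(returns >= threshold), threshold

# Tiny example on a 2-state, 2-action table.
Q_ql = np.zeros((2, 2))
Q_ql[1] = [1.0, 3.0]
q_learning_update(Q_ql, s=0, a=0, r=1.0, s_next=1, alpha=0.5, gamma=1.0)

Q_sarsa = np.zeros((2, 2))
Q_sarsa[1] = [1.0, 3.0]
sarsa_update(Q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=1.0)

elite_idx, elite_threshold = select_elite([0.0, 0.0, 1.0, 1.0], percentile=50)
```

The only difference between the two TD updates is the bootstrap term: Q-Learning uses the maximum over next actions, while SARSA uses the value of the next action chosen by the behaviour policy.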

Dependencies

OpenAI gym
matplotlib
numpy
collections
torch
itertools
plotting

Deterministic Frozenlake Results

Stochastic Frozenlake Results

Cliffwalking Results

Changing Parameters

In all runs, SARSA and Q-Learning outperformed the Cross-Entropy Method on the CliffWalking environment. Changing the hyperparameters changed the results significantly. Notably, increasing the alpha (learning rate) parameter allowed both Q-Learning and SARSA to exceed the baseline results.
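One possible intuition for the effect of alpha, sketched under simplifying assumptions: in a tabular update of the form `q += alpha * (target - q)` with a fixed target, the remaining error shrinks by a factor of `(1 - alpha)` per update, so a larger alpha reaches the target in fewer updates. The helper below is hypothetical and does not reproduce the notebook experiments; in a stochastic environment the target is noisy, and a larger alpha also makes the estimate noisier.

```python
def updates_to_converge(alpha, target=1.0, tol=0.01, max_steps=10_000):
    """Count updates until a single Q-value is within tol of a fixed target."""
    q = 0.0
    for step in range(1, max_steps + 1):
        q += alpha * (target - q)  # same form as the Q-Learning/SARSA update
        if abs(target - q) < tol:
            return step
    return max_steps

steps_high_alpha = updates_to_converge(alpha=0.5)
steps_low_alpha = updates_to_converge(alpha=0.1)
print(steps_high_alpha, steps_low_alpha)
```

With these settings the higher alpha needs noticeably fewer updates to get within the tolerance, which is consistent with faster learning under a fixed episode budget.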