/reinforcement-learning-models

Simple implementation and comparison of three reinforcement learning models.

Primary language: Jupyter Notebook

Reinforcement Learning Algorithms (2021 coding-style)

Implements Q-Learning, SARSA, and the Cross-Entropy Method using NumPy and PyTorch, and compares their performance on three environments: deterministic FrozenLake, stochastic FrozenLake, and CliffWalking.
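As a point of reference for the two temporal-difference methods, here is a minimal sketch of the standard tabular update rules. It is not taken from the notebooks: the function names, the `(n_states, n_actions)` layout of `Q`, and the default `alpha` and `gamma` values are assumptions made for illustration.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, done=False):
    """Off-policy update: bootstrap from the best action available in s_next."""
    # On a terminal transition there is no future value to bootstrap from.
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Example: the same transition gives different updates under the two rules.
Q = np.zeros((2, 2))   # hypothetical 2-state, 2-action table
Q[1] = [1.0, 3.0]      # assumed current estimates for state 1
q_val = q_learning_update(Q.copy(), 0, 0, 1.0, 1, alpha=0.5, gamma=0.9)[0, 0]
s_val = sarsa_update(Q.copy(), 0, 0, 1.0, 1, 0, alpha=0.5, gamma=0.9)[0, 0]
print(q_val, s_val)
```

The only difference between the two rules is the bootstrap term: Q-Learning uses the maximum over next actions, while SARSA uses the value of the next action the behavior policy actually selected.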

Dependencies

  • OpenAI gym
  • matplotlib
  • numpy
  • collections (Python standard library)
  • torch
  • itertools (Python standard library)
  • plotting

Deterministic Frozenlake Results

[Figure: deterministic FrozenLake results]

Stochastic Frozenlake Results

[Figure: stochastic FrozenLake results]

Cliffwalking Results

[Figure: CliffWalking results]

Changing Parameters

[Figure: results with changed parameters]

In all runs, SARSA and Q-Learning outperformed the Cross-Entropy Method on the CliffWalking environment. The results were sensitive to the hyperparameters. Notably, with a higher alpha, Q-Learning and SARSA exceeded the baseline results.
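One way to see why alpha matters for the temporal-difference methods is to look at how fast a single value estimate moves toward a fixed target. The sketch below is illustrative arithmetic only, not the experiment from the notebooks; the function name `estimate_after`, the fixed target of 1.0, and the step count are assumptions.

```python
def estimate_after(alpha, target=1.0, steps=5, q0=0.0):
    """Apply q <- q + alpha * (target - q) a fixed number of times."""
    q = q0
    for _ in range(steps):
        q += alpha * (target - q)
    return q

# After n steps from q0 = 0 the estimate equals target * (1 - (1 - alpha)**n).
print(round(estimate_after(0.1), 5))  # approx. 0.40951
print(round(estimate_after(0.5), 5))  # approx. 0.96875
```

A larger alpha closes the gap to the target in fewer updates, but it also makes each estimate follow noisy targets more closely, so the best value depends on the environment.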

Increasing alpha while reducing gamma produced nearly identical values across all variants of Q-Learning and SARSA. The Cross-Entropy Method, however, became more erratic under the same change.
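For comparison with the temporal-difference methods, here is a minimal sketch of the elite-selection step commonly used in the Cross-Entropy Method. It is not the repository's code: the function names, the `(states, actions, rewards)` episode layout, the 70th-percentile default, and the use of gamma to discount episode returns are all assumptions for illustration.

```python
import numpy as np

def discounted_return(rewards, gamma=0.99):
    """Sum of rewards, each weighted by gamma**t for its time step t."""
    return float(sum(r * gamma**t for t, r in enumerate(rewards)))

def select_elites(episodes, gamma=0.99, percentile=70):
    """Keep (state, action) pairs from episodes at or above the return cutoff.

    `episodes` is a list of (states, actions, rewards) tuples (hypothetical layout).
    Returns the elite pairs and the cutoff that was applied.
    """
    returns = np.array([discounted_return(ep[2], gamma) for ep in episodes])
    cutoff = np.percentile(returns, percentile)
    elite_pairs = [
        (s, a)
        for (states, actions, _), g in zip(episodes, returns)
        if g >= cutoff
        for s, a in zip(states, actions)
    ]
    return elite_pairs, cutoff

# Example batch: only the first episode reaches a reward.
batch = [
    ([0, 1], [1, 1], [0.0, 1.0]),
    ([0], [0], [0.0]),
    ([0, 2], [1, 0], [0.0, 0.0]),
]
pairs, cutoff = select_elites(batch, gamma=0.9)
print(pairs, cutoff)
```

In a typical setup, the elite pairs are then used as supervised training data for the policy, for example with a cross-entropy loss in torch. Because the cutoff is computed from each batch of sampled episodes, the set of elites can change considerably from one batch to the next.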