
Contains solutions to the assignment problems of the course "Reinforcement Learning" offered at IIT Dharwad

Primary Language: Jupyter Notebook

Reinforcement Learning

Contains solutions to the assignment problems of the course "CS404: Reinforcement Learning", offered by Dr. Prabuchandran K J at IIT Dharwad.

  • Assignment 2 : Bandit Algorithms

    • Implemented epsilon-greedy, variable epsilon-greedy, softmax, Upper Confidence Bound (UCB), and Thompson sampling algorithms for Bernoulli and Normal reward settings.
  • Assignment 3 : Value Based Methods

    • Solved a classical maze problem using policy iteration and value iteration.
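
For readers unfamiliar with the bandit setting of Assignment 2, the following is a minimal sketch of epsilon-greedy on Bernoulli arms. It is not taken from the repository's notebooks: the function name `epsilon_greedy`, the arm probabilities, the step count, and the fixed `epsilon` are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy(arm_probs, n_steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on Bernoulli arms (illustrative sketch only).

    arm_probs: assumed success probability of each arm (hypothetical values).
    Returns the estimated arm values, pull counts, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms    # number of times each arm was pulled
    values = [0.0] * n_arms  # running mean reward of each arm
    total_reward = 0

    for _ in range(n_steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Bernoulli reward: 1 with probability arm_probs[arm], else 0.
        reward = 1 if rng.random() < arm_probs[arm] else 0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return values, counts, total_reward


# Hypothetical three-armed problem; the last arm is the best one.
values, counts, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
```

A "variable" epsilon-greedy variant would typically shrink `epsilon` over time (for example, proportionally to 1/t) so that exploration fades as the estimates improve; the exact schedule used in the assignment is not stated here.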
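
To illustrate the value-based methods of Assignment 3, here is a small value iteration sketch on a grid maze. The maze layout, the reward of -1 per step, the deterministic moves, and the discount factor are assumptions made for this example; the maze used in the assignment may differ.

```python
# Hypothetical maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def step(state, action):
    """Deterministic move; bumping into a wall or the border keeps the state."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] != "#":
        return (nr, nc)
    return state


def is_goal(state):
    return MAZE[state[0]][state[1]] == "G"


def value_iteration(gamma=0.9, tol=1e-8):
    """Return the optimal state values and a greedy policy for MAZE."""
    states = [
        (r, c)
        for r, row in enumerate(MAZE)
        for c, ch in enumerate(row)
        if ch != "#"
    ]
    V = {s: 0.0 for s in states}  # the goal is terminal, so its value stays 0

    while True:
        delta = 0.0
        for s in states:
            if is_goal(s):
                continue
            # Bellman optimality backup with an assumed reward of -1 per step.
            best = max(-1.0 + gamma * V[step(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {
        s: max(ACTIONS, key=lambda a: -1.0 + gamma * V[step(s, a)])
        for s in states
        if not is_goal(s)
    }
    return V, policy


V, policy = value_iteration()
```

Policy iteration reaches the same optimal policy by alternating full policy evaluation with greedy policy improvement, whereas value iteration folds both into the single backup shown above.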