This repository provides a collection of theoretical problems (homeworks) and Python implementations (labs).
The agent tries to reach the exit of a maze while evading a minotaur that moves according to a random walk. We provide the problem formulation as an MDP and its solution.
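As an illustration of how a finite MDP of this kind can be solved, here is a minimal value iteration sketch. It is not taken from the lab code: the three-cell corridor, the `transitions` and `rewards` layouts, and the discount factor are all assumptions chosen to keep the example small, and the minotaur is left out entirely.

```python
def value_iteration(states, actions, transitions, rewards, gamma=0.95, tol=1e-8):
    """Return (values, policy) for a finite MDP.

    transitions[s][a] is a list of (probability, next_state) pairs and
    rewards[s][a] is the immediate reward. Both layouts are assumptions
    made for this sketch, not the format used in the lab.
    """
    values = {s: 0.0 for s in states}

    def backup(s, a):
        # Expected one-step return of taking action a in state s.
        return rewards[s][a] + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])

    while True:
        delta = 0.0
        for s in states:
            best = max(backup(s, a) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:  # stop once a full sweep barely changes the values
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(actions, key=lambda a: backup(s, a)) for s in states}
    return values, policy


# Hypothetical toy problem: a corridor with cells 0, 1 and an absorbing exit 2.
states = [0, 1, 2]
actions = ["left", "right"]
transitions = {
    0: {"left": [(1.0, 0)], "right": [(1.0, 1)]},
    1: {"left": [(1.0, 0)], "right": [(1.0, 2)]},
    2: {"left": [(1.0, 2)], "right": [(1.0, 2)]},
}
rewards = {
    0: {"left": 0.0, "right": 0.0},
    1: {"left": 0.0, "right": 1.0},  # reward for stepping into the exit
    2: {"left": 0.0, "right": 0.0},
}

values, policy = value_iteration(states, actions, transitions, rewards)
print(values, policy)
```

In the maze setting, the state would presumably also have to include the minotaur's position, with its random walk entering through the transition probabilities.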
The agent tries to maximize its reward by staying inside a bank for as long as possible while avoiding capture by the police. We provide the problem formulation as an MDP and its solution.
We solve the OpenAI Mountain Car problem using linear function approximators. We provide the solution and a set of trained weights that solve the problem.
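To make "linear function approximator" concrete, the sketch below estimates Q(s, a) as a dot product between per-action weights and a feature vector. The Fourier cosine basis, the semi-gradient SARSA update, the step size, and all function names are assumptions made for illustration; the lab may use a different basis or learning rule. The observation bounds are those of the standard Mountain Car environment.

```python
import math
from itertools import product

# Observation bounds of Mountain Car: (position, velocity).
LOW, HIGH = (-1.2, -0.07), (0.6, 0.07)


def normalize(obs):
    """Scale each observation component to [0, 1]."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(obs, LOW, HIGH)]


def fourier_features(state, order=2):
    """Fourier cosine features of a state whose components lie in [0, 1]."""
    coeffs = product(range(order + 1), repeat=len(state))
    return [math.cos(math.pi * sum(c * x for c, x in zip(cs, state))) for cs in coeffs]


def q_value(weights, features, action):
    """Linear estimate Q(s, a) = w_a . phi(s)."""
    return sum(w * f for w, f in zip(weights[action], features))


def sarsa_update(weights, phi, action, reward, phi_next, action_next, done,
                 alpha=0.1, gamma=1.0):
    """One semi-gradient SARSA step on the weights of the action taken."""
    target = reward if done else reward + gamma * q_value(weights, phi_next, action_next)
    td_error = target - q_value(weights, phi, action)
    weights[action] = [w + alpha * td_error * f for w, f in zip(weights[action], phi)]
    return td_error


# Hypothetical single transition: push right (action 2), receive reward -1.
phi = fourier_features(normalize((-0.5, 0.0)))
phi_next = fourier_features(normalize((-0.49, 0.01)))
weights = [[0.0] * len(phi) for _ in range(3)]  # one weight vector per action
td_error = sarsa_update(weights, phi, 2, -1.0, phi_next, 2, done=False)
print(len(phi), td_error)
```

Because the approximator is linear, the gradient of Q(s, a) with respect to the weights is just the feature vector, which is why the update only scales `phi` by the TD error.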
This lab focuses on the OpenAI Lunar Lander problem for both the discrete and continuous action spaces.
We implement the DQN algorithm with two modifications (Dueling DQN and a combined experience replay buffer) and train it to solve the problem.
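The two modifications can be sketched in plain Python as follows. This is a sketch under assumptions, not the lab's implementation: the class and function names are hypothetical, and a real agent would compute the dueling aggregation inside the network on tensors. Combined experience replay is taken here in its usual sense, where the most recent transition is always included in the sampled batch.

```python
import random
from collections import deque


class CombinedReplayBuffer:
    """Replay buffer whose batches always contain the latest transition."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first

    def append(self, transition):
        self.buffer.append(transition)

    def __len__(self):
        return len(self.buffer)

    def sample(self, batch_size):
        if not 1 <= batch_size <= len(self.buffer):
            raise ValueError("batch_size must be between 1 and the buffer length")
        latest = self.buffer[-1]
        older = list(self.buffer)[:-1]
        # batch_size - 1 uniform samples from the past, plus the newest transition.
        batch = random.sample(older, batch_size - 1)
        batch.append(latest)
        return batch


def dueling_q(value, advantages):
    """Combine a state value and per-action advantages into Q-values.

    Subtracting the mean advantage keeps the decomposition identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


# Hypothetical usage with dummy (state, action, reward, next_state, done) tuples.
buffer = CombinedReplayBuffer(capacity=100)
for i in range(10):
    buffer.append((i, 0, -1.0, i + 1, False))
batch = buffer.sample(4)
q_values = dueling_q(1.0, [1.0, 2.0, 3.0])
print(batch[-1], q_values)
```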
We implement the DDPG algorithm, train our model and solve the problem.
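Two ingredients of DDPG that are easy to show in isolation are the bootstrapped critic target and the soft (Polyak) update of the target networks. The sketch below uses flat Python lists in place of network parameters; the function names and the values of `tau` and `gamma` are assumptions for illustration, not the settings used in the lab.

```python
def critic_target(reward, next_q, done, gamma=0.99):
    """Target y = r + gamma * Q'(s', mu'(s')), with no bootstrap at terminal states.

    next_q stands for the target critic evaluated at the target actor's action.
    """
    return reward if done else reward + gamma * next_q


def soft_update(target_params, online_params, tau=0.001):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


# Hypothetical numbers chosen so the results are easy to check by hand.
y = critic_target(reward=1.0, next_q=2.0, done=False, gamma=0.5)
y_terminal = critic_target(reward=1.0, next_q=2.0, done=True, gamma=0.5)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
print(y, y_terminal, new_target)
```

A small `tau` makes the target networks change slowly, which tends to stabilize the critic's regression target.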
We implement the PPO algorithm, train our model and solve the problem.
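The core of PPO is its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The following is a minimal sketch on plain Python lists; the function name, the `epsilon` value, and the sample numbers are assumptions, and a full implementation would also need advantage estimation and a value loss.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Mean clipped surrogate objective (to be maximized)."""
    terms = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Hypothetical batch: one action made more likely, one made less likely.
objective = ppo_clip_objective(
    new_log_probs=[math.log(1.5), math.log(0.5)],
    old_log_probs=[0.0, 0.0],
    advantages=[1.0, -1.0],
)
print(objective)
```

In this example both probability ratios fall outside the interval [0.8, 1.2], so both terms are clipped and the samples contribute no further gradient in that direction.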