Reinforcement Learning

This repository collects theoretical problems (homework) and Python implementations (labs).

Lab 1

The minotaur maze

The agent tries to find the exit of the maze while evading a minotaur that moves according to a random walk. We formulate the problem as an MDP and provide its solution.
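To make the MDP formulation concrete, here is a minimal sketch of finite-horizon dynamic programming over joint (agent, minotaur) states. It is not the lab's code: the 3x3 wall-free grid, the `EXIT` cell, the rule that the minotaur never stands still, and all function names are assumptions chosen for illustration.

```python
from itertools import product

ROWS, COLS = 3, 3   # hypothetical grid with no inner walls
EXIT = (2, 2)       # hypothetical exit cell
MOVES = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # stay, up, down, left, right
CELLS = list(product(range(ROWS), range(COLS)))


def in_bounds(cell):
    return 0 <= cell[0] < ROWS and 0 <= cell[1] < COLS


def agent_step(cell, move):
    """Move the agent; a move that would leave the grid keeps it in place."""
    nxt = (cell[0] + move[0], cell[1] + move[1])
    return nxt if in_bounds(nxt) else cell


def minotaur_targets(cell):
    """Assumed random walk: uniform over in-bounds neighbours, never standing still."""
    candidates = [(cell[0] + dr, cell[1] + dc) for dr, dc in MOVES[1:]]
    return [c for c in candidates if in_bounds(c)]


def terminal_value(agent, minotaur):
    """1.0 for a safe exit, 0.0 when caught, None for non-terminal states."""
    if agent == minotaur:
        return 0.0
    if agent == EXIT:
        return 1.0
    return None


def escape_probability(horizon):
    """Backward induction: best probability of exiting within `horizon` steps."""
    values = {}
    for agent, minotaur in product(CELLS, CELLS):
        fixed = terminal_value(agent, minotaur)
        values[(agent, minotaur)] = 0.0 if fixed is None else fixed
    for _ in range(horizon):
        updated = {}
        for agent, minotaur in values:
            fixed = terminal_value(agent, minotaur)
            if fixed is not None:
                updated[(agent, minotaur)] = fixed
                continue
            targets = minotaur_targets(minotaur)
            # Both players move simultaneously; average over the minotaur's moves.
            updated[(agent, minotaur)] = max(
                sum(values[(agent_step(agent, mv), t)] for t in targets) / len(targets)
                for mv in MOVES
            )
        values = updated
    return values
```

Calling `escape_probability(T)` returns a dictionary mapping each (agent, minotaur) pair to the best achievable probability of reaching the exit within `T` steps under these assumed rules.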

Robbing banks

The agent tries to maximize its reward by staying inside a bank for as long as possible while avoiding capture by the police. We formulate the problem as an MDP and provide its solution.
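As a rough illustration of solving a discounted MDP of this kind, the sketch below runs value iteration on a toy three-state model. The states, transition probabilities, rewards, and discount factor in `TOY_MDP` are invented for the example and are not the lab's actual formulation.

```python
# Toy model (all numbers hypothetical): staying in the bank pays +1 per step,
# but with probability 0.2 the robber is caught and pays a penalty of -2.
TOY_MDP = {
    "street": {
        "wait": [(1.0, "street", 0.0)],
        "enter": [(1.0, "bank", 0.0)],
    },
    "bank": {
        "stay": [(0.8, "bank", 1.0), (0.2, "caught", -2.0)],
        "leave": [(1.0, "street", 0.0)],
    },
    "caught": {
        "none": [(1.0, "caught", 0.0)],  # absorbing state
    },
}


def value_iteration(mdp, gamma=0.5, tol=1e-10):
    """Return (values, greedy policy) for an MDP given as
    mdp[state][action] = [(probability, next_state, reward), ...]."""
    values = {s: 0.0 for s in mdp}

    def q(state, action):
        return sum(p * (r + gamma * values[nxt]) for p, nxt, r in mdp[state][action])

    while True:
        delta = 0.0
        for s in mdp:
            best = max(q(s, a) for a in mdp[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place (Gauss-Seidel) update
        if delta < tol:
            break

    policy = {s: max(mdp[s], key=lambda a: q(s, a)) for s in mdp}
    return values, policy
```

With these made-up numbers, `value_iteration(TOY_MDP)` finds that entering and staying in the bank is worth the capture risk; a harsher penalty or higher capture probability would flip that decision.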

Mountain Car with linear function approximators

We solve the OpenAI Mountain Car problem using linear function approximators. We provide the solution and a set of trained weights that solve the problem.
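For readers unfamiliar with linear function approximation, the following sketch shows one common setup: a Fourier cosine basis over a normalized state and one weight vector per action, updated by a semi-gradient TD step. The choice of basis, the learning rate, and every function name here are assumptions; the repository's own features and update rule may differ.

```python
import math
from itertools import product


def fourier_features(state, order=2):
    """Fourier cosine basis; `state` must already be scaled to [0, 1] per dimension."""
    coefficients = product(range(order + 1), repeat=len(state))
    return [
        math.cos(math.pi * sum(c * s for c, s in zip(coef, state)))
        for coef in coefficients
    ]


def q_value(weights, features, action):
    """Linear action value: dot product of the action's weights with the features."""
    return sum(w * f for w, f in zip(weights[action], features))


def td_update(weights, features, action, target, alpha=0.05):
    """One semi-gradient step moving Q(s, a) toward `target`; returns the TD error."""
    error = target - q_value(weights, features, action)
    weights[action] = [
        w + alpha * error * f for w, f in zip(weights[action], features)
    ]
    return error
```

In a SARSA-style loop, `target` would be `reward + gamma * q_value(weights, next_features, next_action)`; for Mountain Car the state (position, velocity) would first be rescaled to the unit square.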

Lab 2

This lab focuses on the OpenAI Lunar Lander problem, in both its discrete and continuous action-space versions.

Deep Q-Networks (DQN)

We implement the DQN algorithm with two modifications (a dueling architecture and a combined experience replay buffer) and train it to solve the problem.
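The two modifications can be sketched without a deep learning framework. Below, `dueling_q_values` shows the usual dueling aggregation (state value plus mean-centred advantages), and `CombinedReplayBuffer` shows the idea behind combined experience replay, assumed here to mean that the most recent transition is always included in each sampled batch. Class and function names are hypothetical and the repository's implementation may differ.

```python
import random
from collections import deque


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


class CombinedReplayBuffer:
    """FIFO replay buffer whose batches always contain the newest transition."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest items are dropped when full

    def add(self, transition):
        self.buffer.append(transition)

    def __len__(self):
        return len(self.buffer)

    def sample(self, batch_size, rng=random):
        if not 1 <= batch_size <= len(self.buffer):
            raise ValueError("batch_size must be between 1 and the buffer length")
        items = list(self.buffer)
        latest, older = items[-1], items[:-1]
        # Uniformly sample the rest, then append the latest transition.
        return rng.sample(older, batch_size - 1) + [latest]
```

In a full agent, `state_value` and `advantages` would be the outputs of the two network heads, and each transition would be a tuple such as `(state, action, reward, next_state, done)`.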

Deep Deterministic Policy Gradient (DDPG)

We implement the DDPG algorithm, train our model and solve the problem.
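Two ingredients of standard DDPG are easy to show in isolation: the bootstrapped critic target and the soft (Polyak) update of target-network parameters. The sketch below uses plain Python lists in place of network parameters; the function names and the default `gamma` and `tau` values are assumptions, not values taken from the repository.

```python
def critic_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')), with no bootstrap at episode end.

    `next_q` stands for the target critic evaluated at the target actor's action.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.001):
    """Polyak averaging: theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    return [
        tau * online + (1.0 - tau) * target
        for target, online in zip(target_params, online_params)
    ]
```

A small `tau` makes the target networks trail the online networks slowly, which is what keeps the bootstrapped targets stable during training.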

Proximal Policy Optimization (PPO)

We implement the PPO algorithm, train our model and solve the problem.
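The core of PPO is its clipped surrogate objective, which can be written for a single sample in a few lines. This is a sketch of the standard formula only; the function names and the clipping range `epsilon=0.2` are assumptions rather than the repository's settings.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    `ratio` is pi_new(a|s) / pi_old(a|s); `advantage` is the advantage estimate.
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_loss(ratios, advantages, epsilon=0.2):
    """Negative mean clipped surrogate over a batch (a quantity to minimize)."""
    terms = [clipped_surrogate(r, a, epsilon) for r, a in zip(ratios, advantages)]
    return -sum(terms) / len(terms)
```

Taking the minimum means the objective stops rewarding policy changes once the probability ratio has moved more than `epsilon` in the direction favoured by the advantage, while still penalizing moves in the wrong direction.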

DamianValle/RL2020