Celina Hanouti & Imad Sidhoum (Equal Contribution)
This repository contains the practical work for the course "Reinforcement Learning and Advanced Deep Learning" at Sorbonne Université, 2020-2021.
- TME 1 : Bandit algorithms (stochastic bandits, contextual bandits, ...)
- TME 2 : Planning via Dynamic Programming (Value Iteration, Policy Iteration)
- TME 3 : Value-based methods (TD-lambda, Q-learning, ...)
- TME 4 : Deep value-based methods (DQN, Prioritized Experience Replay, ...)
- TME 5 : Actor-Critic (A2C)
- TME 6-7 : Advanced Actor-Critic (PPO)
- TME 8 : DDPG
- TME 9 : Generative adversarial networks (GANs)
- TME 10 : Variational auto-encoder (VAE)
- TME 11 : Multi-agent DDPG
- TME 12-13 : Imitation Learning (GAIL)
- TME 14 : Curriculum Learning