ErlebnisW/RL-Coursera

Implementations of the Coursera Reinforcement Learning Specialization

Jupyter Notebook, MIT license

The structure of this specialization:

1. Fundamentals of Reinforcement Learning

Week 2: Markov Decision Processes

Assignment: K-armed Bandits and Exploration/Exploitation

Week 3: Value Functions & Bellman Equations

No assignment

Week 4: Dynamic Programming

Assignment: Optimal Policies with Dynamic Programming

2. Sample-based Learning Methods

Week 2: Monte Carlo Methods for Prediction & Control

No assignment

Week 3: Temporal Difference Learning Methods for Prediction

Assignment: Policy Evaluation with Temporal Difference Learning

Week 4: Temporal Difference Learning Methods for Control

Assignment: Q-learning and Expected Sarsa

Week 5: Planning, Learning & Acting

Assignment: Dyna-Q and Dyna-Q+

3. Prediction and Control with Function Approximation

Week 1: On-policy Prediction with Approximation

Assignment: Semi-gradient TD(0) with State Aggregation

Week 2: Constructing Features for Prediction

Assignment: Semi-gradient TD with a Neural Network

Week 3: Function Approximation and Control

Assignment: Episodic Sarsa with Function Approximation and Tile-coding

Week 4: Policy Gradient

Assignment: Average Reward Softmax Actor-Critic with Tile-coding

4. A Complete Reinforcement Learning System (Capstone)

Lunar Lander Projects

Assignment: Build the Lunar Lander Agent
Assignment: Parameter Study