My solutions to the Practical Reinforcement Learning course, offered on Coursera by the National Research University Higher School of Economics. It is part 4 of 7 in the Advanced Machine Learning Specialization.
Assignments are in each week's folder as commented Jupyter notebooks, alongside the quizzes.
- Week 1:
  - OpenAI Gym Environment
  - MDPs
  - Cross-entropy method
  - Evolution strategies
- Week 2:
  - Rewards
  - Bellman equations
  - Policy/Value iteration
- Week 3:
  - Model-free learning
  - Q-learning
  - Exploration vs exploitation
  - Monte Carlo vs Temporal Difference
  - On-policy vs off-policy
  - Experience replay
- Week 4:
  - Approximate value-based methods
  - Loss functions
  - CartPole
  - Deep Q-Learning (DQN)
- Week 5:
  - Policy-based methods
  - Policy-based RL vs Value-based RL
  - REINFORCE
  - Actor-critic
  - A3C
- Week 6:
  - Exploration
  - Measuring
  - Uncertainty-based exploration
  - Bandits
  - Monte Carlo Tree Search
  - seq2seq
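
For a quick feel of the Week 3 material, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not taken from the course notebooks: the five-state chain environment, the helper names (`step`, `choose_action`, `train`), and the hyperparameters are all assumptions made up for illustration, and it uses only the Python standard library rather than Gym.

```python
import random

N_STATES = 5        # states 0..4; state 4 is terminal (assumed toy setup)
ACTIONS = (0, 1)    # 0 = move left, 1 = move right
GAMMA = 0.9         # discount factor
ALPHA = 0.5         # learning rate
EPSILON = 0.2       # exploration probability


def step(state, action):
    """Hypothetical deterministic chain: reward 1.0 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q, state, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, seed=0):
    """Run tabular Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the behaviour policy takes next.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action for every non-terminal state.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

On this toy chain the greedy policy should end up choosing "right" (action `1`) in every non-terminal state. The update bootstraps from `max(q[next_state])` rather than from the action actually taken next, which is what makes Q-learning off-policy.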