
Reinforcement Learning Cheat Sheet

Some important concepts and algorithms in RL, all summarized in one place. The PDF file is also available here.

Contents

  1. Bandits: settings, exploration-exploitation, UCB, Thompson Sampling
  2. RL Framework: Markov Decision Process, Markov Property, Bellman Equations
  3. Dynamic Programming: Policy Evaluation, Policy Iteration, Value Iteration
  4. Value-Based
    1. Tabular environments: Tabular Q-learning, SARSA, TD-learning, eligibility traces
    2. Approximate Q-learning: DQN, prioritized experience replay, Double DQN, Rainbow, DRQN
  5. Policy Gradients
    1. On-Policy: REINFORCE, Actor-Critic (with compatible functions, GAE), A2C/A3C, TRPO, PPO
    2. Off-Policy: Policy gradient theorem, ACER, importance sampling
    3. Continuous Action Spaces: DDPG, Q-Prop
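To give a flavor of the value-based methods listed above, here is a minimal tabular Q-learning sketch on a toy chain MDP. This is a generic illustration, not code from the cheatsheet; the environment (5 states, two actions, reward 1 for reaching the last state) is made up for the example.

```python
import random
from collections import defaultdict

def step(state, action):
    """Toy chain MDP: states 0..4, action 0 = left, 1 = right.
    Reaching state 4 gives reward 1 and ends the episode."""
    next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward, next_state == 4

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # q[(state, action)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap with the max over next actions
            target = reward if done else reward + gamma * max(
                q[(next_state, a)] for a in (0, 1))
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# Greedy policy per state (should learn to always move right)
print([max((0, 1), key=lambda a: q[(s, a)]) for s in range(4)])
```

The update rule is the off-policy TD target from the Q-learning section: the behavior is epsilon-greedy, but the bootstrap uses the greedy max over next actions.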

References

Contributing

Contributions are welcome! If you find any typo or error, feel free to open an issue.

If you would like to contribute to the code and make changes directly (e.g. adding algorithms, adding a new section, etc.), start by cloning the repository:

git clone https://github.com/alxthm/rl-cheatsheet.git

Work locally

Since all the sources and figures are included in the repo, you can make modifications and build the document locally. For this, you need a full TeX distribution (if you don't have one, you can install it here), and you can then edit the LaTeX files with any editor (e.g. Visual Studio Code).
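As a sketch, a typical local build might look like the following. The main file name (`main.tex`) is an assumption here, so check the repository for the actual entry point; `latexmk` ships with most full TeX distributions.

```shell
# Clone the repo and build the PDF locally
# (hypothetical main file name: main.tex -- check the repo)
git clone https://github.com/alxthm/rl-cheatsheet.git
cd rl-cheatsheet
# latexmk reruns pdflatex as needed to resolve cross-references
latexmk -pdf main.tex
```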

Work on Overleaf

If you'd rather avoid installing LaTeX, you can also use Overleaf. To do so, compress the rl-cheatsheet folder and upload it to Overleaf (New Project -> Upload Project).