bstars/RLExamples

Examples of common reinforcement learning algorithms

Python

This repository contains implementations of common reinforcement learning algorithms, including:

Q-learning for the wind-blow example (see page 25 of David Silver's slides)

(Double) DQN for the cart-pole example
(Actor-critic) policy gradient for the cart-pole example
Proximal Policy Optimization (with clipped surrogate objective) for the BipedalWalker example

Modern version of Soft Actor-Critic (following "Soft Actor-Critic Algorithms and Applications") for the walker example

Standard version of Soft Actor-Critic (following "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor") for the walker example

Diversity is All You Need (following "Diversity is All You Need: Learning Skills without a Reward Function") for the BipedalWalker example

Skills shown for Diversity is All You Need: Skill 8, Skill 16, Skill 18
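
As a quick reference for the Proximal Policy Optimization entry in the list above, the clipped surrogate objective can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from this repository: the function name `clipped_surrogate`, its arguments, and the default `clip_eps=0.2` are assumptions made for the example.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    All names here are hypothetical and chosen for illustration only.
    Each argument is a sequence with one entry per sampled action.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. In a deep learning framework, the negative of this value would typically be used as the policy loss.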