Some-RL-Implementation (more to be added)

Policy gradients in TensorFlow for the CartPole environment in OpenAI Gym
Partial implementation of the paper "Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning" (https://arxiv.org/pdf/1708.02596.pdf)
Implementation of the bit-flipping example in the paper "Hindsight Experience Replay" (https://arxiv.org/pdf/1707.01495.pdf)
Proximal Policy Optimization (PPO) Algorithms
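
For readers unfamiliar with the bit-flipping example listed above, here is a minimal sketch of the environment and of hindsight goal relabeling. It is not taken from this repository: the function names (`step`, `run_episode`, `relabel_with_final_goal`), the random stand-in policy, and the choice to compute the reward on the next state are all assumptions made for illustration. Relabeling uses the "final" strategy, where the state reached at the end of the episode is substituted as the goal.

```python
import random


def step(state, goal, action):
    """Flip bit `action`. Reward is 0 if the new state equals the goal, else -1.

    Assumption: the reward is evaluated on the next state.
    """
    next_state = list(state)
    next_state[action] ^= 1
    next_state = tuple(next_state)
    reward = 0 if next_state == goal else -1
    return next_state, reward


def run_episode(n_bits, rng):
    """Roll out one episode of n_bits steps with a random policy (a stand-in)."""
    state = tuple(rng.randint(0, 1) for _ in range(n_bits))
    goal = tuple(rng.randint(0, 1) for _ in range(n_bits))
    episode = []
    for _ in range(n_bits):
        action = rng.randrange(n_bits)
        next_state, reward = step(state, goal, action)
        episode.append((state, action, reward, next_state, goal))
        state = next_state
    return episode


def relabel_with_final_goal(episode):
    """Hindsight relabeling: replay the episode as if the goal had been
    the state actually reached at the end."""
    new_goal = episode[-1][3]
    relabeled = []
    for state, action, _, next_state, _ in episode:
        reward = 0 if next_state == new_goal else -1
        relabeled.append((state, action, reward, next_state, new_goal))
    return relabeled


episode = run_episode(n_bits=8, rng=random.Random(0))
hindsight = relabel_with_final_goal(episode)
```

Both `episode` and `hindsight` would typically be stored in the replay buffer of an off-policy learner. The relabeled copy always ends in a transition with reward 0, which gives the learner a useful signal even when the original goal was never reached.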

litianyang2017/Some-RL-Implementation