Some-RL-Implementation (more to be added)

  1. Policy gradients in TensorFlow for the CartPole environment in OpenAI Gym
  2. Partial implementation of the paper "Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning"
  3. Implementation of the bit-flipping example in the paper "Hindsight Experience Replay"
  4. Proximal Policy Optimization (PPO) Algorithms
  5. "Conservative Q-Learning for Offline Reinforcement Learning" and "Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning"
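
To give a feel for item 3, here is a minimal sketch of the bit-flipping task with hindsight relabeling. It is not the code in this repository: the names `BitFlipEnv`, `sparse_reward`, and `relabel_with_final_state` are hypothetical, the transition tuple layout is an assumption, and only the "final" relabeling strategy (treat the last state reached as the goal) is shown. The reward convention (-1 until the goal is reached, 0 at the goal) follows the sparse reward described in the Hindsight Experience Replay paper.

```python
import random


def sparse_reward(state, goal):
    # Sparse reward: 0 when the goal is reached, -1 otherwise.
    return 0.0 if state == goal else -1.0


class BitFlipEnv:
    """Hypothetical bit-flipping task: each action flips exactly one bit."""

    def __init__(self, n_bits=8, seed=None):
        self.n_bits = n_bits
        self.rng = random.Random(seed)
        self.state = None
        self.goal = None

    def reset(self):
        # Sample a random start state and a random goal of n_bits bits each.
        self.state = tuple(self.rng.randint(0, 1) for _ in range(self.n_bits))
        self.goal = tuple(self.rng.randint(0, 1) for _ in range(self.n_bits))
        return self.state, self.goal

    def step(self, action):
        # Flip the bit at index `action` and score against the true goal.
        bits = list(self.state)
        bits[action] ^= 1
        self.state = tuple(bits)
        return self.state, sparse_reward(self.state, self.goal)


def relabel_with_final_state(episode):
    """'Final' hindsight strategy: pretend the last achieved state was the goal.

    `episode` is a list of (state, action, next_state, goal) tuples (assumed layout).
    Returns (state, action, next_state, new_goal, new_reward) tuples.
    """
    achieved = episode[-1][2]
    return [
        (s, a, s2, achieved, sparse_reward(s2, achieved))
        for (s, a, s2, _goal) in episode
    ]


# Usage: roll out a short random episode, then relabel it.
env = BitFlipEnv(n_bits=4, seed=0)
state, goal = env.reset()
episode = []
for _ in range(4):
    action = env.rng.randrange(env.n_bits)
    next_state, _ = env.step(action)
    episode.append((state, action, next_state, goal))
    state = next_state

relabeled = relabel_with_final_state(episode)
```

After relabeling, the last transition always carries reward 0, so even a failed episode yields a useful learning signal. Both the original and the relabeled transitions would typically go into the replay buffer.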