Ongoing self-study of reinforcement learning.

Primary language: Jupyter Notebook

Reinforcement Learning

Overview

This repository collects the reinforcement learning projects that I'm currently working on.

Result

  • DQN in Lunar Lander with a discrete action space

video_06

image_05

  • Dueling-architecture Double Q-learning with prioritized experience replay, after 4,500 training episodes

video_01

  • Proximal policy optimization in Atari Pong

video_07

image_06

  • Deep Deterministic Policy Gradient (DDPG) in the Lunar Lander continuous environment (failed)

video_05

image_04

  • Deep Deterministic Policy Gradient (DDPG) in the Pendulum environment, with the moving average reward

video_02

image_02

  • Asynchronous Advantage Actor Critic in CartPole.

video_03

image_03

  • Actor Critic in Mountain Car

video_04

image_01
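
The DDPG Pendulum result above is reported with a moving average reward. As a rough illustration only (this is not the code used in this repository, and the function name `moving_average` and the `window` parameter are hypothetical), a moving average over per-episode rewards could be computed like this:

```python
from collections import deque


def moving_average(rewards, window=100):
    """Return the running mean of the last `window` rewards after each episode.

    Illustrative helper, not taken from this repository.
    """
    recent = deque(maxlen=window)  # holds at most `window` most recent rewards
    total = 0.0
    averages = []
    for r in rewards:
        if len(recent) == window:
            # The oldest reward is about to be evicted by the append below.
            total -= recent[0]
        recent.append(r)
        total += r
        averages.append(total / len(recent))
    return averages


print(moving_average([1, 2, 3, 4], window=2))
```

Smoothing the per-episode reward this way makes the training trend easier to read than the raw, noisy episode returns.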

Algorithm

  • Deep Q Network (DQN)
  • Double Q-learning
  • Dueling architecture
  • Proximal policy optimization (PPO)
  • Asynchronous Advantage Actor Critic (A3C)
  • Prioritized experience replay (PER)
  • Deep Deterministic Policy Gradient (DDPG)
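
To make the double learning entry in the list above a little more concrete, here is a minimal sketch of the bootstrap target commonly used in Double DQN: the online network picks the next action and the target network evaluates it. This is an assumption about the general technique, not this repository's implementation; the function name `double_dqn_target` and its arguments are hypothetical, and plain Python lists stand in for network outputs.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Compute a Double DQN target for a single transition.

    q_online_next: Q-values of the next state from the online network.
    q_target_next: Q-values of the next state from the target network.
    Both are plain lists here for illustration (hypothetical inputs).
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


target = double_dqn_target(
    reward=1.0,
    done=False,
    gamma=0.9,
    q_online_next=[0.1, 0.5],
    q_target_next=[2.0, 1.0],
)
print(target)
```

Decoupling selection from evaluation is what distinguishes this target from the standard DQN target, which takes the maximum over the target network's own values.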

Environment

  • OpenAI Gym
    • Space Invaders
    • Lunar Lander (discrete/continuous)
    • Pong
    • CartPole
    • Mountain Car

Tool

  • OS: Ubuntu 20.04
  • GPU: NVIDIA GeForce RTX 2070
  • Laptop: System76 Oryx Pro
  • Neural network frameworks: TensorFlow and PyTorch
  • Google Colab
  • AWS EC2 g4dn.xlarge (Ubuntu 18.04, 1 GPU)
  • AWS Deep Learning AMI (1 GPU)

Studying