Deep Reinforcement Learning study: (1) REINFORCE, (2) Vanilla Actor Critic, (3) Advantage Actor Critic, (4) Deep Q-learning, (5) Deep Deterministic Policy Gradient, (6) Proximal Policy Optimization (discrete/continuous), (7) Soft Actor Critic (discrete/continuous)
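As a starting point for item (1), here is a minimal, dependency-free sketch of the quantity REINFORCE is built on: the discounted return-to-go and the Monte Carlo policy-gradient surrogate loss. The function names `discounted_returns` and `reinforce_loss`, the default `gamma=0.99`, and the use of plain Python lists are assumptions made for illustration; they are not taken from this repository, whose implementations may differ (for example by using tensors, a baseline, or return normalization).

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, rewards, gamma=0.99):
    """Surrogate loss -sum_t log pi(a_t | s_t) * G_t for one episode.

    log_probs: log-probabilities of the actions actually taken (hypothetical
    inputs here; in practice they would come from the policy network).
    """
    returns = discounted_returns(rewards, gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

In a real training loop the log-probabilities would be differentiable outputs of the policy, and minimizing this loss would raise the probability of actions followed by high returns. The actor-critic variants in the list replace the raw return `G_t` with a learned value estimate or advantage.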