Reinforcement-Learning

Implementations of several reinforcement learning algorithms in TensorFlow 1.14.0. All of the code targets continuous action spaces. In the future, it will be used for a motion planner.

conda/pip install gym
conda/pip install tensorflow-gpu==1.14.0

Description

ddpg_gym: DDPG (Deep Deterministic Policy Gradient) for gym-Pendulum.
td3_gym: TD3 (Twin Delayed Deep Deterministic Policy Gradient) for gym-Pendulum.

NOTE: td3_gym does not yet include Target Policy Smoothing; it may be added in the future.
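
For reference, Target Policy Smoothing adds clipped Gaussian noise to the target actor's action before the target critics evaluate it. The following is a minimal NumPy sketch, not code from this repository: the function name `smoothed_target_action` is hypothetical, the defaults `noise_std=0.2` and `noise_clip=0.5` are assumed from common TD3 settings, and the action bounds of ±2.0 assume the Pendulum torque range.

```python
import numpy as np


def smoothed_target_action(target_action, noise_std=0.2, noise_clip=0.5,
                           action_low=-2.0, action_high=2.0, rng=None):
    """Perturb the target actor's action with clipped Gaussian noise.

    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    target_action = np.asarray(target_action, dtype=np.float32)
    # Sample zero-mean Gaussian noise and clip it so the perturbation stays small.
    noise = rng.normal(0.0, noise_std, size=target_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    # Clip again so the smoothed action stays inside the valid action range.
    smoothed = np.clip(target_action + noise, action_low, action_high)
    return smoothed.astype(np.float32)
```

In a TD3 update, the smoothed action would replace the raw target-actor output when computing the target Q-value.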

sac+gym: SAC (Soft Actor-Critic) for gym-Pendulum.
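
For continuous actions, SAC implementations commonly sample from a tanh-squashed Gaussian policy and correct the log-probability for the squashing. The sketch below shows that step in plain NumPy. It is an illustration only and may differ from what this repository does: the function name `sample_squashed_gaussian` is hypothetical, and `action_scale=2.0` assumes the Pendulum torque range.

```python
import numpy as np


def sample_squashed_gaussian(mean, log_std, action_scale=2.0, rng=None):
    """Sample a bounded action and its log-probability.

    mean, log_std: arrays of shape (batch, action_dim).
    Names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = np.asarray(mean, dtype=np.float64)
    log_std = np.asarray(log_std, dtype=np.float64)
    std = np.exp(log_std)
    # Reparameterized sample from the unbounded Gaussian.
    u = mean + std * rng.standard_normal(mean.shape)
    # Gaussian log-density of u, summed over action dimensions.
    log_prob = np.sum(
        -0.5 * ((u - mean) / std) ** 2 - log_std - 0.5 * np.log(2.0 * np.pi),
        axis=-1,
    )
    squashed = np.tanh(u)
    # Change-of-variables correction for a = action_scale * tanh(u);
    # the small constant guards against log(0) when tanh saturates.
    log_prob -= np.sum(np.log(action_scale * (1.0 - squashed ** 2) + 1e-6), axis=-1)
    return action_scale * squashed, log_prob
```

The returned log-probability is the quantity that the entropy term in the SAC objective is computed from.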

TODO

Modify code comments.
Add Target Policy Smoothing in td3_gym.
Add evaluePolicy in ddpg_gym and td3_gym.
Modularize the code: separate components such as the buffer, actor, and critic.
Add an IMPALA-style version of SAC with a replay buffer.
Try to train a planner based on IMPALA-SAC.
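
As one possible shape for the buffer module mentioned in the modularization item above, here is a minimal sketch of a standalone replay buffer. The class name `ReplayBuffer` and its method names are hypothetical, not taken from this repository; the usage example assumes Pendulum's 3-dimensional observation and 1-dimensional action.

```python
import numpy as np


class ReplayBuffer:
    """Fixed-size ring buffer of transitions (illustrative sketch)."""

    def __init__(self, capacity, state_dim, action_dim):
        self.capacity = capacity
        self.states = np.zeros((capacity, state_dim), dtype=np.float32)
        self.actions = np.zeros((capacity, action_dim), dtype=np.float32)
        self.rewards = np.zeros((capacity, 1), dtype=np.float32)
        self.next_states = np.zeros((capacity, state_dim), dtype=np.float32)
        self.dones = np.zeros((capacity, 1), dtype=np.float32)
        self.ptr = 0   # index of the next slot to write
        self.size = 0  # number of valid transitions stored

    def store(self, state, action, reward, next_state, done):
        i = self.ptr
        self.states[i] = state
        self.actions[i] = action
        self.rewards[i] = reward
        self.next_states[i] = next_state
        self.dones[i] = float(done)
        # Advance the write pointer, overwriting the oldest entry when full.
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size, rng=None):
        """Sample uniformly with replacement; requires at least one stored transition."""
        rng = np.random.default_rng() if rng is None else rng
        idx = rng.integers(0, self.size, size=batch_size)
        return (self.states[idx], self.actions[idx], self.rewards[idx],
                self.next_states[idx], self.dones[idx])


# Usage with Pendulum-sized arrays (3-dim observation, 1-dim action).
buffer = ReplayBuffer(capacity=5, state_dim=3, action_dim=1)
for step in range(7):
    buffer.store(np.full(3, step), [0.5], -1.0, np.full(3, step + 1), False)
batch = buffer.sample(batch_size=4)
```

Keeping the buffer free of any TensorFlow dependency would let ddpg_gym, td3_gym, and the SAC code share the same module.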

LTianyyi/Reinforcement-Learning