
Some reinforcement learning code.



Implementations of several reinforcement learning algorithms in TensorFlow 1.14.0. All of the code targets continuous-space problems. In the future, it will be used for a motion planner.

conda/pip install gym
conda/pip install tensorflow-gpu==1.14.0


  1. ddpg_gym: DDPG (Deep Deterministic Policy Gradient) for gym-Pendulum.
  2. td3_gym: TD3 (Twin Delayed Deep Deterministic Policy Gradient) for gym-Pendulum.

NOTE: td3_gym does not yet include Target Policy Smoothing; it may be added in the future.
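For reference, Target Policy Smoothing adds clipped Gaussian noise to the target actor's action before the target critics evaluate it. The following is a minimal NumPy sketch of that step only, not the code in td3_gym. The function name `smooth_target_action` and its parameters are hypothetical; the defaults `noise_std=0.2` and `noise_clip=0.5` are the values commonly used with TD3, and the bounds of -2.0 and 2.0 assume Pendulum's torque range.

```python
import numpy as np


def smooth_target_action(target_action, noise_std=0.2, noise_clip=0.5,
                         action_low=-2.0, action_high=2.0, rng=None):
    """Return the target action perturbed by clipped Gaussian noise.

    All names and default values here are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    target_action = np.asarray(target_action, dtype=np.float64)

    # Sample zero-mean Gaussian noise and clip it so the perturbation stays small.
    noise = rng.normal(0.0, noise_std, size=target_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)

    # Keep the smoothed action inside the valid action range.
    return np.clip(target_action + noise, action_low, action_high)


if __name__ == "__main__":
    batch = np.array([[-1.9], [0.0], [1.9]])  # batch of 1-D Pendulum actions
    print(smooth_target_action(batch, rng=np.random.default_rng(0)))
```

In a TD3 update, the smoothed action would be passed to both target critics, and the smaller of the two Q-values would be used to build the TD target.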

  1. sac+gym: SAC (Soft Actor-Critic) for gym-Pendulum.


  1. Modify code comments.
  2. Add Target Policy Smoothing to td3_gym.
  3. Add evaluePolicy to ddpg_gym and td3_gym.
  4. Modularization: separate modules such as the buffer, actor, and critic.
  5. Add an IMPALA version of SAC with a replay buffer.
  6. Try to train a planner based on IMPALA-SAC.
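As an illustration of the modularization item above, a replay buffer can live in its own module and be shared by the DDPG, TD3, and SAC scripts. The sketch below is one possible shape for it, using only the Python standard library. The class name `ReplayBuffer` and its methods are hypothetical and are not taken from the existing code.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, done) tuples.

    Hypothetical interface, shown only as a starting point.
    """

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops the oldest transition once the buffer is full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._storage)

    def add(self, state, action, reward, next_state, done):
        self._storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement, then regroup the batch field by field.
        batch = self._rng.sample(list(self._storage), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=3, seed=0)
    for step in range(5):
        buffer.add(step, 0.0, -1.0, step + 1, False)
    print(len(buffer))
    print(buffer.sample(2))
```

Each training script would then import the same class instead of defining its own buffer, which keeps the actor and critic code independent of how transitions are stored.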