Implementations of several reinforcement learning algorithms in TensorFlow 1.14.0. All of the code targets continuous action spaces. In the future, it will be used for a motion planner.
conda/pip install gym
conda/pip install tensorflow-gpu==1.14.0
- ddpg_gym: DDPG (Deep Deterministic Policy Gradient) for gym-Pendulum.
- td3_gym: TD3 (Twin Delayed Deep Deterministic Policy Gradient) for gym-Pendulum.
NOTE: TD3 does not yet include Target Policy Smoothing; it may be added in the future.
- sac+gym: SAC (Soft Actor-Critic) for gym-Pendulum.
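
For reference, Target Policy Smoothing (the piece noted as missing from td3_gym) can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in this repository: the helper name `smooth_target_action` is hypothetical, the defaults `noise_std=0.2` and `noise_clip=0.5` follow the values commonly used for TD3, and the action bounds of -2.0 and 2.0 assume the Pendulum environment.

```python
import numpy as np


def smooth_target_action(target_action, noise_std=0.2, noise_clip=0.5,
                         low=-2.0, high=2.0, rng=None):
    """Add clipped Gaussian noise to the target actor's action.

    Hypothetical helper for illustration only. `low` and `high` are the
    assumed action bounds (Pendulum uses a torque in [-2, 2]).
    """
    rng = np.random.RandomState() if rng is None else rng
    target_action = np.asarray(target_action, dtype=np.float64)
    # Sample Gaussian noise and clip it so the perturbation stays small.
    noise = rng.normal(0.0, noise_std, size=target_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    # Clip again so the smoothed action stays inside the valid action range.
    return np.clip(target_action + noise, low, high)


if __name__ == "__main__":
    # A batch of three actions, as a target actor might output them.
    batch = np.array([[1.9], [-1.9], [0.0]])
    print(smooth_target_action(batch, rng=np.random.RandomState(0)))
```

In TD3, the smoothed action would be fed to both target critics when computing the Bellman target, in place of the raw target-actor output. In a TensorFlow 1.14 graph the same two clipping steps could presumably be written with `tf.random.normal` and `tf.clip_by_value`.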
- Modify code comments.
- Add Target Policy Smoothing in td3_gym.
- Add evaluePolicy in ddpg_gym and td3_gym.
- Modularize the code: separate modules such as the buffer, actor, and critic.
- Add an IMPALA version of SAC with a replay buffer.
- Try to train the planner based on IMPALA-SAC.