DDPG implementation, tested on the HalfCheetah environment in MuJoCo.

Deep Deterministic Policy Gradient on HalfCheetah-v2

Dependencies

  • tensorflow
  • gym
  • mujoco-py
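
Before running anything, it can help to confirm that the dependencies are importable. The following is a minimal sketch, not part of the repository: the helper name missing_dependencies is hypothetical, and it assumes the usual import names (tensorflow, gym, and mujoco_py for the mujoco-py package).

```python
import importlib.util

# Package name -> import name. The import names are the conventional ones
# and are an assumption; adjust them if your installation differs.
REQUIRED = {
    "tensorflow": "tensorflow",
    "gym": "gym",
    "mujoco-py": "mujoco_py",
}


def missing_dependencies(required=REQUIRED):
    """Return the package names whose modules cannot be found.

    find_spec only locates a top-level module; it does not import it,
    so this check is cheap and has no side effects.
    """
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


print("Missing:", missing_dependencies() or "none")
```

An empty result only means the modules were found; it does not verify versions or that MuJoCo itself is set up correctly.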

Run!

From a terminal, run python main.py --mode train to train the agent, or python main.py --mode test to test it.
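
For illustration, a --mode flag like the one above could be handled with argparse roughly as follows. This is a hypothetical sketch and not the repository's actual main.py: the function names build_parser and run are assumptions, and the training and testing branches are placeholders.

```python
import argparse


def build_parser():
    """Build a parser that accepts only --mode train or --mode test."""
    parser = argparse.ArgumentParser(description="DDPG on HalfCheetah-v2")
    parser.add_argument(
        "--mode",
        choices=["train", "test"],
        required=True,
        help="train the agent, or test a trained one",
    )
    return parser


def run(argv=None):
    """Parse the arguments and dispatch on the selected mode.

    Returns the selected mode; a real script would start training
    or evaluation in the corresponding branch.
    """
    args = build_parser().parse_args(argv)
    if args.mode == "train":
        pass  # placeholder: the training loop would go here
    else:
        pass  # placeholder: the evaluation loop would go here
    return args.mode


print(run(["--mode", "train"]))
```

Restricting the flag with choices makes argparse reject any other value with a usage message instead of silently doing nothing.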

Results

After roughly 18,000 episodes, the mean reward converges to 2700.
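
One common way to track a mean reward like this is a running mean over the most recent episodes. The sketch below is an assumption about how such a curve could be computed, not a description of how this repository does it; the function name running_mean and the 100-episode window are hypothetical.

```python
from collections import deque


def running_mean(rewards, window=100):
    """Return, for each episode, the mean of the last `window` rewards.

    Early episodes are averaged over however many rewards exist so far.
    """
    recent = deque(maxlen=window)
    total = 0.0
    means = []
    for reward in rewards:
        if len(recent) == window:
            total -= recent[0]  # the oldest reward is about to be evicted
        recent.append(reward)
        total += reward
        means.append(total / len(recent))
    return means


# Tiny example with a window of 2 episodes.
print(running_mean([1, 2, 3, 4], window=2))  # [1.0, 1.5, 2.5, 3.5]
```

A flat running mean over many consecutive episodes is a simple, if informal, sign that training has converged.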

[Figures: cheetah_rew1, cheetah_rew2]