Reproduction of Continuous Control with Deep Reinforcement Learning

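This project reproduces Deep Deterministic Policy Gradient (DDPG), the algorithm introduced in "Continuous Control with Deep Reinforcement Learning".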

Run DDPG

conda env create -f environment.yml
conda activate myddpg

mkdir exp
mkdir exp/log
mkdir exp/summary
mkdir exp/checkpoint
mkdir exp/visualization

chmod +x ./scripts/tn.sh
./scripts/tn.sh
# for visualization in tensorboard
tensorboard --logdir='exp/summary' --port=6006

Visualization

Rewards versus training step

[Figure: rewards (three plots)]

Density plot showing estimated Q values versus observed returns sampled from test episodes

[Figure: density (three plots)]
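
To make the comparison in the density plots concrete, here is a minimal sketch of how observed returns for one test episode could be computed before pairing them with the critic's Q estimates. The function name `discounted_returns`, the default `gamma=0.99`, and the sample values are illustrative assumptions, not code or settings taken from this repository.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode.

    `gamma` is an assumed discount factor; use the value the agent was trained with.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each step reuses the return of the next step.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical test episode: three steps with reward 1.0 each.
episode_rewards = [1.0, 1.0, 1.0]
observed = discounted_returns(episode_rewards, gamma=0.5)

# Hypothetical critic outputs Q(s_t, a_t) for the same three steps.
q_estimates = [1.6, 1.4, 1.1]

# Each (estimated Q, observed return) pair is one point in the density plot.
pairs = list(zip(q_estimates, observed))
print(pairs)
```

If the critic is well calibrated, these pairs should cluster around the diagonal where the estimated Q value equals the observed return.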

References

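Ornstein-Uhlenbeck exploration strategy (rlkit): https://github.com/vitchyr/rlkit/blob/master/rlkit/exploration_strategies/ou_strategy.py

Continuous Control with Deep Reinforcement Learning: https://arxiv.org/pdf/1509.02971v6.pdf

Deterministic Policy Gradient Algorithms: http://proceedings.mlr.press/v32/silver14.pdf

Human-level Control Through Deep Reinforcement Learning: https://www.datascienceassn.org/sites/default/files/Human-level%20Control%20Through%20Deep%20Reinforcement%20Learning.pdf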