PyTorch implementation of Distributed Distributional Deterministic Policy Gradients (D4PG, https://arxiv.org/abs/1804.08617). Supported environments:
- Pendulum-v0
- LunarLanderContinuous-v2
- BipedalWalker-v2
Run `train.py` to run the experiment specified in `config.yaml`.
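As a rough sketch, an experiment entry in `config.yaml` might look like the following. The field names below are assumptions for illustration, not the actual schema of this repository:

```yaml
# Hypothetical experiment configuration (field names are assumed, not the repo's schema)
env: Pendulum-v0          # Gym environment id
num_agents: 4             # parallel actor processes
batch_size: 256
replay_buffer_size: 1000000
gamma: 0.99               # discount factor
num_atoms: 51             # atoms of the distributional critic's support
v_min: -10.0              # lower bound of the value distribution support
v_max: 10.0               # upper bound of the value distribution support
```

Adjust the environment id and hyperparameters to match the experiment you want to reproduce.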
Work in progress; for now, the tests can be used to reproduce results.
Detailed training results are available for:
- Pendulum-v0
- LunarLanderContinuous-v2
- BipedalWalker-v2
The project is partly based on Mark Sinton's TensorFlow implementation, which helped greatly with the difficult parts.