PyTorch implementation of Distributed Distributional Deterministic Policy Gradients (D4PG, https://arxiv.org/abs/1804.08617). Supported environments:
- Pendulum-v0
- LunarLanderContinuous-v2
- BipedalWalker-v2
Run `train.py` to run the experiment specified in `config.yaml`.
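As a rough sketch, an experiment entry in `config.yaml` might look like the following. The field names below are assumptions for illustration, not the actual schema of this repository:

```yaml
# Hypothetical experiment configuration (field names are assumed, not the repo's schema)
env: Pendulum-v0          # Gym environment id
num_agents: 4             # parallel actor processes
batch_size: 256
replay_buffer_size: 1000000
gamma: 0.99               # discount factor
num_atoms: 51             # atoms of the distributional critic's support
v_min: -10.0              # lower bound of the value distribution support
v_max: 10.0               # upper bound of the value distribution support
```

Adjust the environment id and hyperparameters to match the experiment you want to reproduce.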
Work in progress; for now, the tests can be used to reproduce results.
Detailed training results are available for:
- Pendulum-v0
- LunarLanderContinuous-v2
- BipedalWalker-v2
The project is partly based on Mark Sinton's TensorFlow implementation, which helped greatly with the difficult parts.