D4PG-pytorch

PyTorch implementation of Distributed Distributional Deterministic Policy Gradients (https://arxiv.org/abs/1804.08617).

The implementation was tested on environments from OpenAI Gym.

About

D4PG and D3PG implementations with the following features:

  • the learner, sampler, and agents run in separate processes
  • one or more exploiter agents act on the target network without action noise
  • only the exploiters hold the GPU; all exploration processes run on the CPU
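
The process layout described in the list above can be sketched roughly as follows. This is a toy illustration, not the repository's code: the names agent_worker, sampler_worker, learner_worker and run_pipeline are hypothetical, the "policy" is a fixed linear rule standing in for a network, and the learner only counts batches where a real one would take a gradient step. An exploiter is modelled simply as an agent with noise_scale set to 0.0.

```python
import multiprocessing as mp
import random


def agent_worker(transitions, n_steps, noise_scale, seed=0):
    """Push n_steps toy transitions; noise_scale=0.0 models an exploiter."""
    rng = random.Random(seed)
    state = 1.0
    for _ in range(n_steps):
        greedy_action = -0.5 * state  # hypothetical stand-in for a policy network
        action = greedy_action + noise_scale * rng.gauss(0.0, 1.0)
        next_state = state + action
        reward = -abs(next_state)
        transitions.put((state, action, reward, next_state))
        state = next_state
    transitions.put(None)  # sentinel: this agent has finished


def sampler_worker(transitions, batches, n_agents, batch_size, seed=0):
    """Collect transitions into a replay list and emit random batches."""
    rng = random.Random(seed)
    replay = []
    finished = 0
    while finished < n_agents:
        item = transitions.get()
        if item is None:
            finished += 1
            continue
        replay.append(item)
        if len(replay) >= batch_size:
            batches.put(rng.sample(replay, batch_size))
    batches.put(None)  # sentinel: no more batches


def learner_worker(batches, results):
    """Consume batches until the sentinel arrives, then report the count."""
    n_updates = 0
    while True:
        batch = batches.get()
        if batch is None:
            break
        n_updates += 1  # a real learner would update the networks here
    results.put(n_updates)


def run_pipeline(n_explorers=2, n_steps=20, batch_size=4):
    """Start explorers, one exploiter, a sampler and a learner as processes."""
    transitions, batches, results = mp.Queue(), mp.Queue(), mp.Queue()
    n_agents = n_explorers + 1  # explorers plus one noiseless exploiter
    procs = [
        mp.Process(target=agent_worker, args=(transitions, n_steps, 0.3, i))
        for i in range(n_explorers)
    ]
    procs.append(mp.Process(target=agent_worker, args=(transitions, n_steps, 0.0)))
    procs.append(
        mp.Process(target=sampler_worker,
                   args=(transitions, batches, n_agents, batch_size))
    )
    procs.append(mp.Process(target=learner_worker, args=(batches, results)))
    for p in procs:
        p.start()
    n_updates = results.get()  # read the result before joining
    for p in procs:
        p.join()
    return n_updates
```

To try it, call run_pipeline() from a script under an if __name__ == "__main__": guard. The workers only need objects with put and get, so they can also be exercised in a single process with queue.Queue for quick checks.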

The project was tested on Ubuntu 18.04 with a 4-core Intel i5 and an Nvidia GTX 1080 Ti.

Usage

Run training with a config file, for example: python train.py --config configs/pendulum_d4pg.yml
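
For readers wondering how a script could consume the --config flag, here is a minimal sketch. It is an assumption about the general pattern, not the contents of the repository's train.py: parse_args and load_config are hypothetical helper names, and the structure of the YAML file is not specified here. It relies on PyYAML (the third-party yaml package), which is imported lazily so that argument parsing works without it.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only the --config flag is assumed here."""
    parser = argparse.ArgumentParser(description="Train an agent from a YAML config.")
    parser.add_argument("--config", required=True, help="path to a YAML config file")
    return parser.parse_args(argv)


def load_config(path):
    """Read a YAML file into plain Python objects (requires PyYAML)."""
    import yaml  # third-party dependency, imported only when needed

    with open(path) as f:
        return yaml.safe_load(f)
```

A script would typically call load_config(parse_args().config) and pass the resulting dictionary to whatever builds the learner, sampler, and agents.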

Tests

python -m unittest discover

Results

[Figure: plot of results]

Reproduce

All results were obtained with the configs in the configs directory.
