
TARS-RL

Distributed Reinforcement Learning Framework.

Algorithms

  • DDPG [1]
  • C51 (Categorical DDPG) [2]
  • QR-DQN (Quantile DDPG) [3]
  • Soft Actor-Critic (SAC) [4]
  • TD3 [5]
  • Quantile TD3
  • Ensemble of the above algorithms, trained on the same batch (see the sketch after this list)
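
The ensemble option means the algorithms are updated together, each on the same sampled batch, rather than each sampling its own. A schematic sketch of that idea; the train-step functions and the replay buffer here are hypothetical stand-ins for illustration, not TARS-RL code:

import random

# Hypothetical per-algorithm update functions, for illustration only;
# in the real framework each algorithm has its own training op.
def ddpg_step(batch):
    pass  # one gradient update of the DDPG actor/critic

def td3_step(batch):
    pass  # one gradient update of the TD3 actor/critics

ensemble = [ddpg_step, td3_step]
replay_buffer = [("obs", "action", 0.0, "next_obs", False)] * 1000

# One training iteration: a single batch is sampled from the shared
# replay buffer and every algorithm in the ensemble trains on it.
batch = random.sample(replay_buffer, 256)
for train_step in ensemble:
    train_step(batch)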

Features

  • Client-server architecture (you don't need to incorporate the RL framework into your environment, just use the client; see the sketch after this list)
  • The server collects experience and trains on it
  • An arbitrary number of parallel agents (clients) can send gathered experience to the server over the network
  • All hyperparameters in one file
  • Different exploration parameters for every agent
  • Easy to implement new algorithms
  • Supports any Gym-compatible environment out of the box
  • Python 3.6
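
From an agent's point of view, the client-server split looks roughly like the sketch below. Everything here is illustrative: RLClient, its constructor arguments, its methods, and the port number are hypothetical names, not the actual TARS-RL API (see the rl_server package for the real interface), and the Gym calls assume the pre-0.26 Gym API this project was written against.

import gym

# Hypothetical client API, for illustration only.
class RLClient:
    def __init__(self, server_host, server_port):
        # Would connect to the training server over the network.
        self.server = (server_host, server_port)

    def act(self, observation, action_space):
        # Would query the current policy (plus exploration noise);
        # random actions stand in for it here.
        return action_space.sample()

    def store_transition(self, obs, action, reward, next_obs, done):
        # Would send the transition to the server's replay buffer.
        pass

env = gym.make("LunarLanderContinuous-v2")
client = RLClient(server_host="localhost", server_port=8777)

obs = env.reset()
for _ in range(1000):
    action = client.act(obs, env.action_space)
    next_obs, reward, done, info = env.step(action)
    client.store_transition(obs, action, reward, next_obs, done)
    obs = env.reset() if done else next_obs
env.close()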

Example envs

OpenAI Gym:

  • Bipedal Walker (both simple and hardcore)
  • Lunar Lander
  • Pendulum

Challenges:

  • NeurIPS 2017: Learning to Run
  • NeurIPS 2018: AI for Prosthetics Challenge
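
All of the Gym examples are standard environments; a quick sanity check that one of them is available (assuming gym with the Box2D extras from the installation section, and the pre-0.26 Gym API):

import gym

# Create one of the example environments and run a few random steps.
env = gym.make("LunarLanderContinuous-v2")
obs = env.reset()
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
env.close()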

Documentation

See the config file description.
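
The experiment configs (e.g. experiments/lunar_lander/config_ddpg.yml, used in the run commands below) are YAML files. A minimal sketch of inspecting one, assuming the files are plain YAML and PyYAML is installed; the section names printed are whatever the file contains, not a documented schema:

import yaml

# Load an experiment config and list its top-level sections.
with open("experiments/lunar_lander/config_ddpg.yml") as f:
    config = yaml.safe_load(f)

for section, value in config.items():
    print(section, type(value).__name__)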

Installation

Step 0 (optional, but highly recommended). Install Anaconda with Python 3.6 from the download page or the archived versions.

1. Clone repo
$ git clone

2. Add to PATH your anaconda
$ export PATH=/path/to/your/anaconda/bin/:$PATH

3. Install requirements
$ pip install tensorflow
or
$ pip install tensorflow-gpu
if you have a GPU supported by TensorFlow

$ pip install tensorboardX

4. For OpenAI gym examples
$ pip install 'gym[box2d]'

or see how to install all Gym envs: https://github.com/openai/gym

How to run

$ cd root/of/tars-rl

run server (Lunar Lander config as an example)
$ python -m rl_server.server.run_server --config experiments/lunar_lander/config_ddpg.yml

run agents
(7 parallel agents on your computer,
 assuming your CPU has 8 threads)
$ CUDA_VISIBLE_DEVICES="" python -m rl_server.server.run_agents --config experiments/lunar_lander/config_ddpg.yml
 
CUDA_VISIBLE_DEVICES="" keeps the agents off the GPU
so they don't interrupt the server's train operations
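
The same effect is available from inside Python; hiding CUDA devices this way is standard TensorFlow/CUDA behavior, not TARS-RL-specific. The variable just has to be set before TensorFlow is first imported:

import os

# Hide all CUDA devices so this process runs CPU-only.
# Must be set before TensorFlow is first imported.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf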

run a trained policy from a checkpoint without the server
$ python -m rl_server.server.play --config path/to/config.yml --checkpoint path/to/model-10000.ckpt --seed 1234

Credits

References

  1. Continuous Control with Deep Reinforcement Learning (DDPG)
  2. A Distributional Perspective on Reinforcement Learning (C51)
  3. Distributional Reinforcement Learning with Quantile Regression (QR-DQN)
  4. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (SAC-GMM)
  5. Addressing Function Approximation Error in Actor-Critic Methods (TD3)
  6. Layer Normalization
  7. Parameter Space Noise for Exploration
  8. Noisy Networks for Exploration

Roadmap

  1. Train envs, make videos, write docs
  2. Release TARS-RL
  3. Add HER
  4. Support pytorch
  5. Add self-play