Distributed Reinforcement Learning Framework.
- DDPG pdf
- C51 (Categorical DDPG) pdf
- QR-DQN (Quantile DDPG) pdf
- Soft Actor-Critic (SAC) pdf
- TD3 pdf
- Quantile TD3
- Ensemble of algorithms (use same batch for training)
- Client-Server architecture (you don't need to incorporate the RL framework into your environment, just use the client)
- Server collects experience and runs training
- An arbitrary number of parallel agents (clients) can send gathered experience to the server over the network
- All hyperparameters in one file
- Different exploration parameters for every agent
- Easy to implement new algorithms
- Supports any Gym-compatible environment out of the box
- Python 3.6
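The client-server split described in the list above can be pictured with a minimal sketch. It is not the framework's actual API: `ExperienceServer`, `exploration_scales` and `run_agent` are hypothetical names, the "server" is an in-process buffer instead of a networked one, and the environment is a toy one-dimensional walk. It only illustrates several agents with different exploration noise feeding one shared buffer, from which a single batch is drawn.

```python
import random
from collections import deque


class ExperienceServer:
    """Hypothetical stand-in for the server: stores transitions, samples batches."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        return rng.sample(list(self.buffer), batch_size)


def exploration_scales(num_agents, low=0.05, high=0.5):
    """Give every agent its own noise scale, spaced evenly between low and high."""
    if num_agents == 1:
        return [low]
    step = (high - low) / (num_agents - 1)
    return [low + i * step for i in range(num_agents)]


def run_agent(agent_id, noise_scale, server, steps, rng):
    """Toy agent: sends (agent_id, state, action, reward, next_state) tuples."""
    state = 0.0
    for _ in range(steps):
        action = rng.gauss(0.0, noise_scale)  # exploration noise only, no policy
        next_state = state + action
        reward = -abs(next_state)
        server.push((agent_id, state, action, reward, next_state))
        state = next_state


rng = random.Random(0)
server = ExperienceServer()
scales = exploration_scales(num_agents=4)
for agent_id, scale in enumerate(scales):
    run_agent(agent_id, scale, server, steps=25, rng=rng)

# One shared batch, which every algorithm in an ensemble could train on.
batch = server.sample(batch_size=8, rng=rng)
```

In the real framework the agents run as separate processes and talk to the server over the network; the sketch keeps everything in one process so it can be run as is.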
OpenAI Gym:
Challenges:
Step 0. Install Anaconda with Python 3.6 from the download page, or see the archived versions (optional, but highly recommended)
1. Clone repo
$ git clone
2. Add your Anaconda to PATH
$ export PATH=/path/to/your/anaconda/bin/:$PATH
3. Install requirements
$ pip install tensorflow
or
$ pip install tensorflow-gpu
if you have a GPU supported by TensorFlow
$ pip install tensorboardX
4. For OpenAI gym examples
$ pip install 'gym[box2d]'
or see how to install all Gym envs
https://github.com/openai/gym
$ cd root/of/tars-rl
run server (Lunar Lander config as an example)
$ python -m rl_server.server.run_server --config experiments/lunar_lander/config_ddpg.yml
run agents
(7 parallel agents on your computer,
assuming you have a CPU with 8 threads)
$ CUDA_VISIBLE_DEVICES="" python -m rl_server.server.run_agents --config experiments/lunar_lander/config_ddpg.yml
CUDA_VISIBLE_DEVICES=""
hides the GPU from the agent processes, so they run on the CPU
and do not interrupt the server's training operations
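If you launch agents from your own Python script rather than from the shell, the same setting can be passed through the process environment. The following is a minimal sketch; `agent_env` is a hypothetical helper, not part of the framework.

```python
import os


def agent_env(base_env=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base_env is None else base_env)
    # An empty string means no GPU is visible to the process.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


env = agent_env({"PATH": "/usr/bin", "CUDA_VISIBLE_DEVICES": "0"})
print(env["CUDA_VISIBLE_DEVICES"] == "")
```

The resulting dictionary can be given to `subprocess.run(cmd, env=agent_env())`, which leaves the environment of the calling process (for example, the server) unchanged.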
run a trained policy from a checkpoint without the server
$ python -m rl_server.server.play --config path/to/config.yml --checkpoint path/to/model-10000.ckpt --seed 1234
- Oleksii Hrinchuk, e-mail, github
- Anton Pechenko, github, linkedin, youtube
- Sergey Kolesnikov, github
- Continuous Control with Deep Reinforcement Learning (DDPG) (pdf).
- A Distributional Perspective on Reinforcement Learning (C51) (pdf).
- Distributional Reinforcement Learning with Quantile Regression (QR-DQN) (pdf).
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep RL with a Stochastic Actor (SAC-GMM) (pdf).
- Addressing Function Approximation Error in Actor-Critic Methods (TD3) (pdf).
- Layer Normalization (pdf)
- Parameter Space Noise for Exploration (pdf)
- Noisy Networks for Exploration (pdf)
- Train envs, make videos, write docs
- Release TARS-RL
- Add HER
- Support pytorch
- Add self-play