TF2RL is a deep reinforcement learning library that implements various algorithms using TensorFlow 2.x.
The following algorithms are supported:
Algorithm | Discrete action | Continuous action | Support | Category |
---|---|---|---|---|
VPG, PPO | ✓ | ✓ | GAE | Model-free On-policy RL |
DQN (including DDQN, Prior. DQN, Duel. DQN, Distrib. DQN, Noisy DQN) | ✓ | - | ApeX | Model-free Off-policy RL |
DDPG (including TD3, BiResDDPG) | - | ✓ | ApeX | Model-free Off-policy RL |
SAC | ✓ | ✓ | ApeX | Model-free Off-policy RL |
MPC | ✓ | ✓ | - | Model-based RL |
GAIL, GAIfO, VAIL (including Spectral Normalization) | ✓ | ✓ | - | Imitation Learning |
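The off-policy algorithms in the table (DQN, DDPG, SAC) learn from a replay buffer of past transitions. TF2RL delegates buffer management to the cpprb library; the core idea can be sketched in plain Python (illustrative only — the class and method names below are not TF2RL's API):

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal uniform-sampling replay buffer (illustrative, not TF2RL's API)."""

    def __init__(self, capacity):
        # deque with maxlen evicts the oldest transition once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling; prioritized replay would weight by TD error instead
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)


buf = ReplayBuffer(capacity=3)
for t in range(5):
    buf.add(t, 0, 1.0, t + 1, False)
print(len(buf))  # 3: the two oldest transitions were evicted
```

Prioritized variants (as in Prioritized Experience Replay) replace the uniform `sample` with sampling proportional to each transition's TD error.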
The following papers have been implemented in TF2RL:
- Model-free On-policy RL
- Model-free Off-policy RL
  - Playing Atari with Deep Reinforcement Learning, code
  - Human-level control through Deep Reinforcement Learning, code
  - Deep Reinforcement Learning with Double Q-learning, code
  - Prioritized Experience Replay, code
  - Dueling Network Architectures for Deep Reinforcement Learning, code
  - A Distributional Perspective on Reinforcement Learning, code
  - Noisy Networks for Exploration, code
  - Distributed Prioritized Experience Replay, code
  - Continuous control with deep reinforcement learning, code
  - Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor / Soft Actor-Critic Algorithms and Applications, code
  - Addressing Function Approximation Error in Actor-Critic Methods, code
  - Deep Residual Reinforcement Learning, code
  - Soft Actor-Critic for Discrete Action Settings, code
- Model-based RL
- Imitation Learning
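The table above notes GAE support for the on-policy algorithms (VPG, PPO). The Generalized Advantage Estimation recursion can be sketched in plain Python (illustrative; this is not TF2RL's implementation):

```python
def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation.

    `values` must contain one extra entry: the bootstrap value of the
    state reached after the final transition.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        nonterminal = 1.0 - float(dones[t])
        # One-step TD error
        delta = rewards[t] + gamma * values[t + 1] * nonterminal - values[t]
        # Exponentially weighted sum of TD errors
        gae = delta + gamma * lam * nonterminal * gae
        advantages[t] = gae
    return advantages


# With gamma = lam = 1 and zero values, GAE reduces to reward-to-go
print(compute_gae([1.0, 1.0, 1.0], [0.0] * 4, [False] * 3, gamma=1.0, lam=1.0))
# [3.0, 2.0, 1.0]
```

The `lam` parameter trades bias against variance: `lam=0` recovers the one-step TD error, while `lam=1` recovers the full Monte-Carlo advantage.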
Some additional useful techniques are also implemented.
You can install TF2RL from PyPI:

```bash
$ pip install tf2rl
```

or install it from source:

```bash
$ git clone https://github.com/keiohta/tf2rl.git tf2rl
$ cd tf2rl
$ pip install .
```
Here is a quick example of how to train a DDPG agent on the Pendulum environment:

```python
import gym

from tf2rl.algos.ddpg import DDPG
from tf2rl.experiments.trainer import Trainer

parser = Trainer.get_argument()
parser = DDPG.get_argument(parser)
args = parser.parse_args()

env = gym.make("Pendulum-v0")
test_env = gym.make("Pendulum-v0")

policy = DDPG(
    state_shape=env.observation_space.shape,
    action_dim=env.action_space.high.size,
    gpu=-1,  # Run on CPU; to run on a GPU, pass its device index
    memory_capacity=10000,
    max_action=env.action_space.high[0],
    batch_size=32,
    n_warmup=500)
trainer = Trainer(policy, env, args, test_env=test_env)
trainer()
```
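DDPG-family algorithms (DDPG, TD3, SAC) stabilize training with slowly moving target networks updated by Polyak averaging. The rule can be sketched in plain Python (illustrative; TF2RL performs the equivalent update on TensorFlow variables internally):

```python
def soft_update(target_weights, source_weights, tau=0.005):
    """Polyak averaging: target <- tau * source + (1 - tau) * target."""
    return [tau * s + (1.0 - tau) * t
            for t, s in zip(target_weights, source_weights)]


target = [0.0, 0.0]
source = [1.0, 2.0]
target = soft_update(target, source, tau=0.5)
print(target)  # [0.5, 1.0]
```

A small `tau` keeps the target network changing slowly, which damps the feedback loop between the learned Q-function and its own bootstrap targets.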
You can find example scripts for the implemented algorithms under the examples directory. For example, to train a DDPG agent:

```bash
# You must change directory to avoid importing local files
$ cd examples
# Pass --help to see the available options
$ python run_ddpg.py [options]
```
You can monitor training progress and results with TensorBoard:

```bash
# When executing `run_**.py`, logs are automatically generated under `./results`
$ tensorboard --logdir results
```