Status: Maintenance (expect bug fixes and minor updates)
This is a clone of Spinning Up that aims to work with the latest PyTorch version. The TensorFlow-based code has been removed from the original repository. Beginners should get started at spinningup.openai.com!
Added support for different activation functions for pi (policy) and v (value function) in:
- VPG
- TRPO
- PPO
- DDPG
- TD3
git clone git@github.com:haha1227/spinningup-pytorch.git
cd spinningup-pytorch
pip install -e .
To uninstall it, run the following:
pip uninstall spinup
python -m spinup.run [algo name] [experiment flags]
E.g.:
python3 -m spinup.run ppo --env CartPole-v1 --exp_name walker --hid [64,64] --pi_act torch.nn.Tanh
Or run the same algorithm over several hyperparameter settings:
python3 -m spinup.run ppo --env CartPole-v1 --exp_name walker --hid [32] [64,64] --pi_act torch.nn.Tanh torch.nn.ReLU
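As a rough illustration of what the multi-value command above asks for, the sketch below enumerates the combinations of the listed values. It assumes the launcher runs one experiment per combination, as the upstream Spinning Up launcher does; `expand_grid` is a hypothetical helper written for this example, not part of spinup.

```python
from itertools import product

# Candidate values copied from the command above.
hidden_sizes = [[32], [64, 64]]
pi_activations = ["torch.nn.Tanh", "torch.nn.ReLU"]


def expand_grid(**options):
    """Hypothetical helper: return one dict per combination of option values."""
    keys = list(options)
    return [dict(zip(keys, combo)) for combo in product(*options.values())]


variants = expand_grid(hid=hidden_sizes, pi_act=pi_activations)
for variant in variants:
    print(variant)
```

Under that assumption, two `--hid` values and two `--pi_act` values give four runs in total.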
from spinup import ppo_pytorch as ppo
import gym
import torch
seed = 0
exp_name = 'walker'
env_fn = lambda: gym.make('CartPole-v1')
ac_kwargs = dict(hidden_sizes=[64, 64], pi_output_activation=torch.nn.Tanh)
logger_kwargs = dict(output_dir=''.join([exp_name, '_s', str(seed)]), exp_name=exp_name)
ppo(env_fn=env_fn, ac_kwargs=ac_kwargs, steps_per_epoch=5000, epochs=250, seed=seed, logger_kwargs=logger_kwargs)
from spinup.utils.run_utils import ExperimentGrid
from spinup import ppo_pytorch
import torch
if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--cpu', type=int, default=4)
    parser.add_argument('--num_runs', type=int, default=3)
    args = parser.parse_args()

    eg = ExperimentGrid(name='ppo-pyt-bench')
    eg.add('env_name', 'CartPole-v1', '', True)
    eg.add('seed', [10*i for i in range(args.num_runs)])
    eg.add('epochs', 10)
    eg.add('steps_per_epoch', 4000)
    eg.add('ac_kwargs:hidden_sizes', [(32,), (64,64)], 'hid')
    eg.add('ac_kwargs:activation', [torch.nn.Tanh, torch.nn.ReLU], '')
    eg.run(ppo_pytorch, num_cpu=args.cpu)
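The colon in keys such as 'ac_kwargs:hidden_sizes' in the grid above marks a nested argument: the value ends up inside the `ac_kwargs` dict passed to the algorithm. The sketch below shows one way such keys could be unpacked. `unflatten` is a hypothetical helper for illustration only, not the actual ExperimentGrid code, and the activation is written as a string so the example runs without torch.

```python
def unflatten(flat):
    """Hypothetical helper: turn {'a:b': 1} into {'a': {'b': 1}}."""
    nested = {}
    for key, value in flat.items():
        parts = key.split(":")
        node = nested
        # Walk (and create) intermediate dicts for every part but the last.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


# One variant from the grid above, with colon-separated keys.
variant = {
    "env_name": "CartPole-v1",
    "seed": 0,
    "ac_kwargs:hidden_sizes": (64, 64),
    "ac_kwargs:activation": "torch.nn.Tanh",
}
kwargs = unflatten(variant)
print(kwargs["ac_kwargs"])
```

With the default `--num_runs 3`, the grid above has 3 seeds, 2 hidden-size settings, and 2 activations, so it describes 12 variants.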
Note: by default, trained models are saved in the data folder.
python -m spinup.run test_policy path/to/output_directory
or use a script:
from spinup.utils.test_policy import load_policy_and_env, run_policy
import your_env
_, get_action = load_policy_and_env('/path/to/output_directory')
env = your_env.make()
run_policy(env, get_action)
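For intuition about what running a loaded policy involves, here is a minimal rollout loop. It is a sketch, not the actual `run_policy` implementation: `CountdownEnv` and `rollout` are hypothetical names, and the environment is assumed to follow the classic gym interface in which `step` returns `(obs, reward, done, info)`.

```python
class CountdownEnv:
    """Hypothetical stand-in environment that ends after `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        # Observation is the step count; every step gives a reward of 1.
        return float(self.t), 1.0, done, {}


def rollout(env, get_action, num_episodes=2):
    """Run the policy for a few episodes and collect the episode returns."""
    returns = []
    for _ in range(num_episodes):
        obs, done, ep_ret = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(get_action(obs))
            ep_ret += reward
        returns.append(ep_ret)
    return returns


# A constant-action function stands in for the loaded `get_action`.
episode_returns = rollout(CountdownEnv(horizon=5), get_action=lambda obs: 0)
print(episode_returns)
```

In real use, `get_action` would come from `load_policy_and_env` and `env` from your own environment module, as in the script above.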
python -m spinup.run plot [path/to/output_directory ...] [--legend [LEGEND ...]]
[--xaxis XAXIS] [--value [VALUE ...]] [--count] [--smooth S]
[--select [SEL ...]] [--exclude [EXC ...]]
See the Spinning Up documentation at spinningup.openai.com for details on these options.
If you reference or use Spinning Up in your research, please cite:
@article{SpinningUp2018,
author = {Achiam, Joshua},
title = {{Spinning Up in Deep Reinforcement Learning}},
year = {2018}
}