Benchmark data for reinforcement learning (RL) on the DeepMind Control Suite and MuJoCo.
All baseline algorithms are run with code from one of three sources: ① the Spinning Up repository / ② the Fujimoto TD3 repository / ③ the QingLi implementation.
The baseline algorithms are listed below:
- Deep Deterministic Policy Gradient (DDPG)
- Proximal Policy Optimization (PPO)
- Soft Actor-Critic (SAC)
- Twin Delayed Deep Deterministic Policy Gradient (TD3)
# Example. Note: `-l` sets the curve labels, and `-s` sets the smoothing value.
python spinupUtils/plot.py \
MuJoCo-3M/SpinningUp/DDPG/DDPG-Hopper-v2 \
MuJoCo-3M/SpinningUp/PPO/PPO-Hopper-v2 \
MuJoCo-3M/SpinningUp/TD3/TD3-Hopper-v2 \
MuJoCo-3M/SpinningUp/SAC/SAC-Hopper-v2 \
--env Hopper-v2 \
-l DDPG PPO TD3 SAC -s 10
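
The effect of the smoothing value can be illustrated with a short sketch. It assumes that `-s` applies a centered moving average over the logged returns, in the spirit of the plotting script shipped with Spinning Up; the actual behavior of `spinupUtils/plot.py` is not confirmed here, and the function name `smooth_curve` is hypothetical.

```python
def smooth_curve(values, window):
    """Centered moving average (assumed behavior of the `-s` option).

    Points near the ends are averaged over the samples that exist,
    so the curve is not pulled towards zero at its edges.
    """
    xs = [float(v) for v in values]
    if window <= 1:
        return xs  # no smoothing requested
    smoothed = []
    for i in range(len(xs)):
        # Window of up to `window` samples centered on index i.
        lo = max(0, i - window // 2)
        hi = min(len(xs), i + (window - 1) // 2 + 1)
        chunk = xs[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


returns = [0.0, 10.0, 20.0, 30.0, 40.0]
print(smooth_curve(returns, 3))  # [5.0, 10.0, 20.0, 30.0, 35.0]
```

A larger window gives a flatter curve at the cost of hiding short-lived changes in return.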
Including Ant-v2, HalfCheetah-v2, Hopper-v2, Humanoid-v2, Swimmer-v2, Walker2d-v2.
- The baseline code is from the Spinning Up repository; the agents are trained for 3 million time steps.
- The baseline code is from the QingLi implementation; the agents are trained for 3 million time steps.
Including Ant-v2, HalfCheetah-v2, Hopper-v2, Humanoid-v2, Swimmer-v2, Walker2d-v2.
- The baseline code is from the Fujimoto TD3 repository; the agents are trained for 1 million time steps by default.
Including acrobot-swingup, ball_in_cup-catch, cartpole-swingup, cartpole-swingup_sparse, cartpole-three_poles, cartpole-two_poles, cheetah-run, finger-spin, finger-turn_easy, finger-turn_hard, fish-swim, hopper-hop, hopper-stand, humanoid-run, humanoid-run_pure_state, humanoid-stand, pendulum-swingup, point_mass-easy, point_mass-hard, quadruped-fetch, quadruped-run, quadruped-walk, swimmer-swimmer6, swimmer-swimmer15, walker-run.
- The baseline code is from the Spinning Up repository; the agents are trained for 3 million time steps.
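
To plot a different MuJoCo environment, the plotting command can be assembled programmatically. The sketch below assumes that every run directory follows the `MuJoCo-3M/SpinningUp/<ALGO>/<ALGO>-<ENV>` pattern seen in the Hopper-v2 example; this layout is inferred from that single example and may not hold for the other data sets. The helper name `build_plot_command` is hypothetical.

```python
import shlex


def build_plot_command(env, algos=("DDPG", "PPO", "TD3", "SAC"),
                       root="MuJoCo-3M/SpinningUp", smooth=10):
    """Return the argument list for spinupUtils/plot.py for one environment.

    Assumption: logs live under <root>/<ALGO>/<ALGO>-<ENV>, as in the
    Hopper-v2 example command.
    """
    logdirs = [f"{root}/{algo}/{algo}-{env}" for algo in algos]
    return ["python", "spinupUtils/plot.py", *logdirs,
            "--env", env, "-l", *algos, "-s", str(smooth)]


cmd = build_plot_command("Walker2d-v2")
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, or the printed string can be pasted into a shell.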
@misc{QingLi2021continuousbenchmark,
author = {Qing Li},
title = {Continuous Control Benchmark of DeepMind Control Suite and MuJoCo},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/LQNew/Continuous_Control_Benchmark}}
}