Continuous_Control_Benchmark: A Python repository from LQNew

Benchmark data for reinforcement learning (RL) on the DeepMind Control Suite and MuJoCo.
All baseline results were produced with code from one of three sources: ① Spinning Up repository / ② Fujimoto TD3 repository / ③ QingLi Implementation.
Baseline algorithms are listed below:

Plot results

# e.g. Note: `-l` sets the curve labels, and `-s` sets the smoothing value.
python spinupUtils/plot.py \
    MuJoCo-3M/SpinningUp/DDPG/DDPG-Hopper-v2 \
    MuJoCo-3M/SpinningUp/PPO/PPO-Hopper-v2 \
    MuJoCo-3M/SpinningUp/TD3/TD3-Hopper-v2 \
    MuJoCo-3M/SpinningUp/SAC/SAC-Hopper-v2 \
    --env Hopper-v2 \
    -l DDPG PPO TD3 SAC -s 10
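
To give a feel for what a smoothing value such as `-s 10` might do to a learning curve, here is a minimal sketch of a moving average. It assumes the smoothing value is the width of an averaging window; the function name `smooth` is hypothetical, and the actual behavior of `spinupUtils/plot.py` may differ.

```python
def smooth(values, window):
    """Moving average over `values` (illustrative; not taken from plot.py).

    Each output point is the mean of up to `window` neighboring inputs.
    Near the edges the window is truncated, so the output has the same
    length as the input.
    """
    values = [float(v) for v in values]
    n = len(values)
    if window <= 1 or n == 0:
        return values
    smoothed = []
    for i in range(n):
        lo = max(0, i - window // 2)          # left edge, clamped to the start
        hi = min(n, i - window // 2 + window)  # right edge, clamped to the end
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


returns = [0, 3, 6]
print(smooth(returns, 3))  # [1.5, 3.0, 4.5]
```

A larger window gives a flatter curve but hides short-lived spikes and dips in the episode return.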

MuJoCo-3M

Including Ant-v2, HalfCheetah-v2, Hopper-v2, Humanoid-v2, Swimmer-v2, Walker2d-v2.

The baseline code is from the Spinning Up repository; the agents are trained for 3 million time steps.
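
The plot command in the "Plot results" section can be assembled for other environments with a small helper. This is a sketch only: it assumes every result directory follows the `MuJoCo-3M/SpinningUp/<ALGO>/<ALGO>-<ENV>` pattern seen in that example, and the function name `build_plot_command` is hypothetical.

```python
import shlex


def build_plot_command(env, algos, root="MuJoCo-3M/SpinningUp", smooth=10):
    """Return the argument list for spinupUtils/plot.py.

    Assumption: each log directory is <root>/<ALGO>/<ALGO>-<env>, as in
    the Hopper-v2 example; check the repository layout before relying on it.
    """
    log_dirs = [f"{root}/{algo}/{algo}-{env}" for algo in algos]
    return [
        "python", "spinupUtils/plot.py",
        *log_dirs,
        "--env", env,
        "-l", *algos,
        "-s", str(smooth),
    ]


cmd = build_plot_command("Walker2d-v2", ["DDPG", "PPO", "TD3", "SAC"])
print(shlex.join(cmd))  # one shell-ready line; pass `cmd` to subprocess.run to execute
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.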

The baseline code is from the QingLi Implementation; the agents are trained for 3 million time steps.

MuJoCo-1M

Including Ant-v2, HalfCheetah-v2, Hopper-v2, Humanoid-v2, Swimmer-v2, Walker2d-v2.

The baseline code is from the Fujimoto TD3 repository; the agents are trained for 1 million time steps by default.

DMControlSuite-3M

Including acrobot-swingup, ball_in_cup-catch, cartpole-swingup, cartpole-swingup_sparse, cartpole-three_poles, cartpole-two_poles, cheetah-run, finger-spin, finger-turn_easy, finger-turn_hard, fish-swim, hopper-hop, hopper-stand, humanoid-run, humanoid-run_pure_state, humanoid-stand, pendulum-swingup, point_mass-easy, point_mass-hard, quadruped-fetch, quadruped-run, quadruped-walk, swimmer-swimmer6, swimmer-swimmer15, walker-run.
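
The names above appear to follow a `<domain>-<task>` convention, where the domain may itself contain underscores (for example `ball_in_cup`). The sketch below splits a name at the first hyphen; the helper `split_task` is hypothetical and the convention is inferred from the list, not taken from the repository code.

```python
def split_task(name):
    """Split 'domain-task' at the first hyphen (illustrative helper)."""
    domain, sep, task = name.partition("-")
    if not sep or not domain or not task:
        raise ValueError(f"expected '<domain>-<task>', got {name!r}")
    return domain, task


for name in ["ball_in_cup-catch", "humanoid-run_pure_state", "swimmer-swimmer6"]:
    domain, task = split_task(name)
    # With dm_control installed, the pair could be passed on as
    # suite.load(domain_name=domain, task_name=task).
    print(f"{name}: domain={domain}, task={task}")
```

Splitting at the first hyphen is enough here because none of the listed domain names contain a hyphen.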

The baseline code is from the Spinning Up repository; the agents are trained for 3 million time steps.

Citation

@misc{QingLi2021continuousbenchmark,
  author = {Qing Li},
  title = {Continuous Control Benchmark of DeepMind Control Suite and MuJoCo},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LQNew/Continuous_Control_Benchmark}}
}