vwxyzjn/cleanrl

High-quality single-file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python

Pinned issues

PPO improvements

#206 opened 6 months ago

Closed
0

Roadmap for CleanRL

#115 opened 6 months ago

Closed
0

Issues

PPO Complex Obs/Action Space
#353 opened a year ago
3
About PPO+Procgen code on Jax
#352 opened 6 months ago
7
Reproduction of Muesli
#350 opened 6 months ago
23
Add Polyak update to DQN
#346 opened a year ago
1
Deprecate `ppo_procgen.py` in favor of EnvPool
#340 opened 6 months ago
2
What is the reason for returning mean in SAC get_action function if it's never used?
#333 opened a year ago
1
CleanRL for MARL
#330 opened 6 months ago
15
GitPod instance errors out when running poetry install
#325 opened 2 years ago
3
Typo in c51.py
#324 opened 2 years ago
1
Target network isn't updated at the correct frequency when `target_network_frequency % train_frequency != 0`
#322 opened 2 years ago
0
Benchmark `dqn_jax.py` using CPU only
#317 opened a year ago
0
Why is there no evaluation and model-saving module?
#310 opened 2 years ago
22
DDPG JAX breaks with Python ~3.7
#309 opened 6 months ago
2
Unable to render video in GitPod
#305 opened 2 years ago
0
SAC Implementation Details
#304 opened 2 years ago
0
CUDA with SAC
#303 opened 2 years ago
1
Action bias is added twice in DDPG algorithm implementation, similar to #259
#297 opened 2 years ago
0
RLops Guide
#296 opened a year ago
1
PPO+LSTM training on continuous environments
#290 opened 2 years ago
8
Re-benchmarking refactored algorithms
#289 opened 2 years ago
1
Requirements - requirements-pettingzoo.txt
#283 opened 2 years ago
3
Problem with multi-agent atari
#280 opened 2 years ago
2
TD3 policy noise bugs
#279 opened 2 years ago
2
Are you interested in PRs for improving the performance of the PPO LSTM script?
#276 opened 2 years ago
3
SAC discrete
#266 opened 6 months ago
3
Multi-objective hyperparameter optimization
#265 opened 2 years ago
3
Upgrade gym version to 0.26.1
#263 opened 6 months ago
2
Action bias is added twice in TD3 algorithm implementation
#259 opened 2 years ago
2
Add TQC to CleanRL
#258 opened 2 years ago
5
Data corruption due to run naming convention when running on Slurm/GridEngine
#256 opened 2 years ago
1
DQN on MountainCar
#255 opened 2 years ago
3
Adding unit tests
#252 opened 2 years ago
0
Poetry install fails with "isaacgymenvs (rev poetry) is not satisfied"
#251 opened 2 years ago
6
Adding Double DQN
#250 opened 2 years ago
1
RL Formulation
#249 opened 2 years ago
1
Adding Hierarchical RL Algorithms
#248 opened 6 months ago
7
Poetry can't install torch nightly
#247 opened 2 years ago
2
Adding TRPO implementation
#245 opened 6 months ago
5
AsyncVectorEnv
#244 opened 2 years ago
3
A question about the `PPO` algorithm
#240 opened 2 years ago
5
Replace cloud utilities w/ `torchx`
#239 opened 6 months ago
1
Slow `poetry` dependency locking and resolution
#237 opened 6 months ago
0
JAX + C51
#221 opened a year ago
0
JAX + DQN
#220 opened 2 years ago
1
JAX + TD3
#219 opened 2 years ago
1
JAX Integration with CleanRL
#218 opened 2 years ago
0
Prototype TD3 with JAX
#216 opened 2 years ago
1
PPO with Humanoid
#215 opened 2 years ago
2
Adding Average Reward PPO proposal
#210 opened 6 months ago
3
Remove the value function clipping
#208 opened 6 months ago
0