DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Python · MIT License
Issues
What does the output of model.learn mean?
#1934 opened by LeZhengThu - 3
[Feature Request] Allow users to define gradient steps as a fraction of rollout time-steps
#1920 opened by janakact - 2
[Question] Running Multi-threaded PPO training independently with no interference
#1931 opened by n-kish - 5
Setting up seed in Custom Gym environment
#1932 opened by Chainesh - 1
[Question] SAC, a torch model becomes a bool somehow
#1930 opened by JaimeParker - 3
[Question] An error while using SAC and DDPG
#1923 opened by minxuef - 2
[Question] Why torch model in c++ got totally different output from python
#1925 opened by JaimeParker - 8
SubprocVecEnv Sets Out-of-Range Seeds for My Environments (ScenarioNet Environment)
#1921 opened by chrisgao99 - 1
SAC model not properly saved
#1916 opened by PabloVD - 6
Scaling Environment
#1907 opened by Hamza-101 - 5
[Bug]: evaluate_policy called multiple times for vectorized environments
#1912 opened by LukasFehring - 3
Handling mission space in BabyAI env
#1914 opened by Chainesh - 9
[Bug]: Scaling Environment
#1906 opened by Hamza-101 - 8
[Bug]: Load Trained Policy
#1911 opened by zlw21gxy - 3
[Question] Why are policy gradient loss and explained variance very small (almost zero) from the start of training?
#1897 opened by Ahmed-Radwan094 - 2
[Question] Saving PPO rollout buffer on GPU
#1891 opened by Ahmed-Radwan094 - 2
[Question] CheckpointCallback keep last K
#1893 opened by NickLucche - 2
Scalability
#1905 opened by Hamza-101 - 1
[Question] How to avoid SAC getting stuck in local minima
#1903 opened by JaimeParker - 4
[Bug]: if learning_rate function uses special types, they can cause torch.load to fail when weights_only=True
#1900 opened by markscsmith - 4
[Question] Discontinuous reward training curve
#1898 opened by JaimeParker - 1
Why does the Logger only return the train/ metrics, and not eval/, time/, and rollout/?
#1888 opened by liamquantrill - 2
Why does VecFrameStack clear the prior frames in the stack for the step when "terminated=True"?
#1883 opened by wkwan - 1
[Bug]: EOFError after running for some steps
#1890 opened by GeorgeWuzy - 3
[Feature Request] Enable predict to take tensor as input
#1896 opened by llewynS - 2
Off policy algorithm policy_kwargs
#1895 opened by suargi - 2
[Bug]: Potential Bug in PPO? Clarification requested
#1894 opened by azrael417 - 5
Exporting MultiInputActorCriticPolicy as ONNX
#1873 opened by MaximCamilleri - 2
Issue (HER within SAC algorithm)
#1892 opened by wadeKeith - 8
[Question] Discretize continuous actions/observations?
#1887 opened by nrigol - 2
[Question] influence of buffer size when using vecenv and save customized replay buffer
#1885 opened by JaimeParker - 2
How to elegantly modify an algorithm by adding new architectures trained with custom losses?
#1881 opened by jamesheald - 1
[Question] [Multiprocessing] RolloutBuffer groups environment transitions on a per-environment basis.
#1880 opened by N00bcak - 0
[Question] Control PPO training
#1872 opened by mwalidcharrwi - 1
[Question] Action masking for a DQN Agent
#1876 opened by Tim1605 - 7
[Question] Training PPO model with single step episodes
#1874 opened by oshadajay - 4
[Feature Request] Resume trained model with set_parameters without reset_num_timesteps
#1877 opened by tanielsfranklin - 1
How does stable-baselines work with a multi-agent PettingZoo environment?
#1878 opened by AnastasiaPsarou - 1
[Question] Changes in observations
#1875 opened by d505