DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Python · MIT License
Issues
How can I change a Distribution?
#2032 opened by CAI23sbP - 1
[Bug] Importing Stable Baselines3 on Linux and Windows: directory issue
#2034 opened by Feelfeel20088 - 2
[Question] Can a model be used in environments with different observation_space sizes?
#2031 opened by SummerDiver - 5
[Question] Batch Size Selection for a Finite MDP
#2024 opened by DavidLudl - 1
[Question] How to customize the loss calculation for PPO
#2028 opened by olmoulin - 0
[Bug]: When using SubprocVecEnv for parallel training of agents, the ep_rew_mean is no longer being recorded.
#2027 opened by didu11 - 2
[Bug]: How to avoid saving an external LLM model while saving a customized DQN policy
#2025 opened by chrisgao99 - 2
[Question] WSL freeze with ROCm
#2021 opened by darkavatar23 - 1
[Feature Request] JAX API and gymnax
#2020 opened by quanmissq - 2
[Question] How to change my loaded model's "exploration_initial_eps", "exploration_final_eps", "exploration_fraction" parameters?
#2018 opened by Melih96 - 1
[Question] Can stable-baselines3 be installed through pip without cuda dependencies? Is the CPU only docker image the only alternative?
#2019 opened by joemc94 - 1
[Question] Auto-regressive policy network?
#2010 opened by wadmes - 1
Is it possible to filter out episodes longer than a specified length before adding their experience to the replay_buffer?
#1999 opened by CornfileChase - 1
[Question] About the logger
#1997 opened by XiaobenLi00 - 1
Logger information
#1998 opened by XiaobenLi00 - 5
[Feature Request] Warn users when using GPU with `A2C`/`PPO` + update documentation
#2012 opened by jws-1 - 1
Cannot load PPO model when using custom net_arch
#2015 opened by krishdotn1 - 2
[bug] Adaptive SAC: using logarithm of entropy coefficient to compute temperature objective instead of entropy coefficient
#2013 opened by Mattia-sony - 3
[Question] The entropy value is a negative number, and the entropy loss is a positive number.
#2004 opened by YSAA1 - 1
A successfully PPO-trained agent requires some re-training steps to make good predictions
#2009 opened by tanielsfranklin - 2
[Feature Request] Safe Reinforcement Learning & Multi-Objective Reinforcement Learning
#2008 opened by cherrywoods - 2
[Question] Shared feature extractor and gradient
#2006 opened by brn-dev - 2
[Bug]: SubprocVecEnv TypeError: reset() got an unexpected keyword argument 'seed'
#2001 opened by ccleavinger - 1
[Question] MARL using Stable Baselines3
#2007 opened by Hamza-101 - 2
[Feature Request] add_scalars for write() function in TensorBoardOutputFormat in logger
#1994 opened by shimonShouei - 2
[Feature Request] Add support for optional environment wrapping in base_class.py
#1996 opened by hasan-yaman - 4
observation_space does not match reset() observation, and the environment is being initialised with render_mode='human', which is not in the possible render_modes ([])
#1992 opened by XiaobenLi00 - 0
That WORKED!!
#1991 opened by XiaobenLi00 - 1
[Question] How to train 2 models in parallel?
#1982 opened by kozlolet - 1
[Question] About the output layer of algorithms
#1988 opened by abdulkadrtr - 4
Custom actor and critic network
#1985 opened by krishdotn1 - 1
[Feature Request] Temporal Convolutional network
#1984 opened by tty666 - 2
[Bug]: 'CarRacing' object has no attribute 'num_envs'
#1983 opened by kuds - 1
[Question] Questions about CNN policy input channel
#1977 opened by DavidLudl - 1
[Bug]: Load a PPO model and restart learning
#1974 opened by nrigol