hill-a/stable-baselines
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Python · MIT license
Issues
Using multiple environments with Unity ML-Agents
#1193 opened by 871234342 - 4
MlpPolicy network output layer softmax activation for continuous action space problem?
#1190 opened by wbzhang233 - 1
Resume Training on Checkpoint
#1188 opened by mingjohnson - 0
resume the training
#1192 opened by rambo1111 - 1
Model generated using VecNormalize, model predict use case
#1189 opened by madhekar - 0
VecNormalize 'training' attribute
#1187 opened by madhekar - 0
Customize training process
#1186 opened by dunhuiliu - 0
AssertionError, observation space, VecTransposeImage
#1185 opened by skgr07 - 0
Dimension mismatch when using custom feature extractor
#1184 opened by yassinetb - 8
[question] How to get the model architecture when using recurrent policy?
#1145 opened by borninfreedom - 0
[Question] Can I train agents in a nested loop in SB3?
#1183 opened by j-thib - 0
python setup.py egg_info did not run successfully.
#1182 opened by perp1exed - 0
[question] TypeError: 'NoneType' object is not callable with user defined env
#1181 opened by Charles-Lim93 - 0
DQN report. [QUESTION]
#1180 opened by smbrine - 0
[question] _on_step method in custom callback
#1179 opened by vrige - 0
Can I use an agent with act, and observe interactions with no/minimum use of environment?
#1178 opened by aheidariiiiii1993 - 1
[question] PPO load using .pkl file
#1176 opened by meric-sakarya - 0
[question] for an RL algorithm with a discrete action space, is it possible to get a probability of outcomes when feeding in data?
#1171 opened by george-adams1 - 1
[Question] Callback-collected model does not have same reward as training verbose [custom gym environment]
#1170 opened by hotpotking-lol - 2
Store the training result
#1169 opened by hotpotking-lol - 2
Environment checker returns assertion error contradicting debug statements
#1168 opened by techboy-coder - 6
True rewards remaining "zero" in the trajectories in Stable Baselines 2 for custom environments
#1167 opened by moizuet - 2
Deep Q-value network evaluation in SAC algorithm
#1166 opened by moizuet - 1
SAC results with large variance
#1150 opened by dibbla - 0
1D Vector of floats as an observation space
#1164 opened by WilliamFlinchbaugh - 2
FPS varies enormously
#1161 opened by leo2r - 4
Cannot install stable baselines 3
#1160 opened by OishikGuha - 2
Accessing observations during training aka .learn()
#1159 opened by user-1701 - 1
Is it wrong to reward an action on the next step?
#1158 opened by DaniilKardava - 1
Data normalization for a2c inputs?
#1157 opened by DaniilKardava - 1
Prediction for same observation using same model
#1156 opened by DaniilKardava - 3
Invalid Actions, Mask and DQN
#1155 opened by Cyazd - 1
Problem retraining PPO1 model and using Tensorflow with Stable Baselines 2
#1154 opened by durantagre - 1
Minigrid -- "Kernel size can't be greater than actual input size" for DQN
#1153 opened by raymond2338 - 1
Running Stable Baselines on M1 Macs?
#1152 opened by adamnhaka - 7
Unable to see stable-baselines output
#1151 opened by Michael-HK - 3
[question] How do I load a tensorflow ckpt?
#1147 opened by Syzygianinfern0 - 1
[question] How to use previously obtained state-action-reward-next state information to save time on training?
#1146 opened by kwak9601 - 1
import stable-baselines [Question]
#1144 opened by MariaPiaGelos - 1
PPO ValueError: The parameter loc has invalid values
#1143 opened by olyanos