ikostrikov/pytorch-a2c-ppo-acktr-gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Python · MIT
Issues
get different results when I set the same seed
#237 opened by yunke-wang - 0
Updates: Support the latest Atari environment and state entropy maximization-based exploration.
#296 opened by yuanmingqi - 0
Why didn't run to generate log?
#294 opened by Can-no - 0
Where are the experts data for GAIL get from?
#292 opened by YY-GX - 0
Oops! wrong repo :-D
#290 opened by andyk - 1
question about the recurrent
#289 opened by rainbow979 - 0
why PPO needs to store action_log_probs instead of using stop_gradient for better efficiency?
#284 opened by Emerald01 - 0
object has no attribute 'steps' in acktr
#283 opened by sungreong - 2
Combine Acktr model with grad-cam
#268 opened by seed851218 - 0
No softmax before categorical loss?
#282 opened by nirweingarten - 0
Operations that have no effect
#281 opened by ArashVahabpour - 0
CNN Architecture
#280 opened by araffin - 0
Stale hidden states
#278 opened by aklein1995 - 0
Can not run enjoy.py
#277 opened by juanjuan2 - 0
Can I train in my own game
#276 opened by hhhcwb38712 - 0
observation reset before insert
#274 opened by seed851218 - 0
Can't access the trained model files.
#261 opened by TigerVersusT - 0
New parallel PyTorchRL library based on this one
#267 opened by giadefa - 0
enjoy.py fails. Unexpected argument 'ret'
#263 opened by jakefoster954 - 2
assert 'NoFrameskip' in env.spec.id
#253 opened by liuqi8827 - 1
Is setting the flag "use-proper-time-limits" to True recommended for all gym environments with a time limit?
#259 opened by PeixinC - 1
Unable to run enjoy.py
#262 opened by jakefoster954 - 1
Wrong continuous actions
#245 opened by oroojlooy - 0
PPO does not converge for Pendulum-v0
#260 opened by ZhizhenQin - 0
adaptive adam learning rate
#252 opened by a-z-e-r-i-l-a - 0
should h5py be listed as dependency?
#251 opened by suliuzh - 2
GAIL uses AIRL reward function
#236 opened by HareshKarnan - 0
What can compute_grad_pen in gail.py do?
#250 opened by ruleGreen - 0
EOFError when entering a subprocess worker
#249 opened by Artimisu - 0
How to run these examples without tensorflow?
#247 opened by jonndoe - 0
Make pretrained models available for Atari Games
#246 opened by asaran - 0
Mujoco Reacher-v2 fails to train
#244 opened by oroojlooy - 1
Usage of gradient penalty without Wasserstein Loss
#240 opened by mayankg95 - 1
the usage of after_update in rollout storage
#243 opened by jiangsy - 0
FPS calculation
#238 opened by Xemnas0 - 0
Insert obs, action in storage (PPO)
#235 opened by mynsng