Khrylx/PyTorch-RL
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
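The "fast Fisher vector product" in the description refers to a standard TRPO trick: the natural-gradient step `F⁻¹g` is found by conjugate gradient, which only needs products `F·v` and never materializes the Fisher matrix `F`. A minimal NumPy sketch of that CG solve, using a toy diagonal Gaussian policy whose Fisher matrix is known in closed form (the function names and toy numbers here are illustrative, not this repository's API):

```python
import numpy as np

def conjugate_gradient(fvp, g, iters=10, tol=1e-10):
    """Solve F x = g using only Fisher-vector products fvp(v) = F @ v."""
    x = np.zeros_like(g)
    r = g.copy()            # residual g - F x (x starts at 0)
    p = r.copy()            # search direction
    rs_old = r @ r
    for _ in range(iters):
        Fp = fvp(p)
        alpha = rs_old / (p @ Fp)
        x += alpha * p
        r -= alpha * Fp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Toy policy N(mu, std^2): w.r.t. params (mu, log_std) the Fisher matrix
# is diag(1/std^2, 2), so fvp is an elementwise product and F is never built.
std = np.array([0.5, 1.0, 2.0])
fisher_diag = np.concatenate([1.0 / std**2, 2.0 * np.ones_like(std)])
fvp = lambda v: fisher_diag * v

g = np.ones(6)                      # placeholder surrogate-loss gradient
step = conjugate_gradient(fvp, g)   # natural-gradient direction F^{-1} g
```

In the actual PyTorch setting, `fvp` is instead implemented by double backpropagation through the KL divergence (the Pearlmutter trick), which is what makes the product "fast" relative to forming the full matrix.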
Python · MIT License
Issues
- Failure to train GAIL in the Ant-v2 environment (#28, opened by seolhokim, 1 comment)
- TRPO: is fixed_log_probs the same as log_probs? (#35, opened by yongpan0715, 0 comments)
- Is the implemented performance comparable with the results in the original GAIL paper? (#34, opened by huang-fuxian, 1 comment)
- Various questions (#26, opened by lviano, 0 comments)
- A question about the PPO implementation (#33, opened by pengzhi1998, 0 comments)
- About computing the Hessian-vector product (#32, opened by jjjhfffjj, 0 comments)
- Implementation problem (#27, opened by pengzhi1998, 4 comments)
- Mountain Car (#24, opened by jpark0315, 1 comment)
- Question on multiprocessing (#22, opened by pengzhi1998, 2 comments)
- Doubt regarding the calculation of the advantage (#23, opened by nesarasr, 3 comments)
- About the KL divergence (#21, opened by yangyiqin-tsinghua, 1 comment)
- Is this an error: num_steps += (t + 1)? (#20, opened by pprivulet, 4 comments)
- Question about A2C (#17, opened by kishanpb, 0 comments)
- Confusion about advantage computation (#16, opened by gunshi, 0 comments)
- Question about weight initialization (#14, opened by gunshi, 1 comment)
- Not able to run the TRPO example on GPU (#12, opened by avijit9, 1 comment)
- TRPO: KL divergence computation (#11, opened by sandeepnRES, 4 comments)
- Training a recurrent policy (#4, opened by erschmidt, 1 comment)
- A few runtime errors (#10, opened by sandeepnRES, 2 comments)
- Entropy term for GAIL (#9, opened by sandeepnRES, 1 comment)
- CNN policy (#8, opened by bbalaji-ucsd, 2 comments)
- Result is not good (#7, 2 comments)
- Autograd import error (#3, opened by aseembits93, 2 comments)
- Memory leak during GPU training (#2, opened by erschmidt, 4 comments)
- CudnnRNN is not differentiable twice (#1, opened by erschmidt)