ajlangley/cpo-pytorch
An implementation of Constrained Policy Optimization (Achiam et al., 2017) in PyTorch
Python
Issues
- 5
mean KL is always 0
#10 opened by xzhang2523 - 1
"from envs.ant_gather import AntGatherEnv"
#7 opened by xzhang2523 - 1
a "bug"? in the cpo method
#8 opened by xzhang2523 - 1
line 2 leads to imp_sampling=1
#9 opened by xzhang2523 - 3
- 9
Does it converge?
#6 opened by Bigpig4396 - 1
Where can I find the AntGather env?
#4 opened by DZ9 - 1
Nice work
#1 opened by xiaoyuanzh - 1
Some questions about the code
#2 opened by Baiyu6666 - 1
[question] How to turn my custom environment into an environment suitable for CPO?
#3 opened by kosmylo