quantumiracle/Popular-RL-Algorithms

PyTorch implementation of Soft Actor-Critic (SAC), Twin Delayed DDPG (TD3), Actor-Critic (AC/A2C), Proximal Policy Optimization (PPO), QT-Opt, PointNet..

Jupyter Notebook · Apache-2.0

Issues

Environment setup and Python version
#81 opened a month ago by vickychen928
0
“expected sequence of length 4 at dim 1” when running dqn.py
#80 opened 2 months ago by githubfsfs
2
About RL+LSTM
#67 opened 2 years ago by 4359hhh
1
Why does every PPO training run produce the same reward chart? This puzzles me very much.
#48 opened 3 years ago by Alexzzdfjcn
6
ValueError on SAC v2 LSTM
#34 opened 3 years ago by sarmientoj24
3
Please evaluate the performance over multiple seeds
#47 opened 3 years ago by thlautenschlaeger
1
Error: ppo_gae_discrete.py
#45 opened 3 years ago by lucifer2859
1
NameError: name 'last_action' is not defined
#42 opened 3 years ago by Nick-Kou
1
FileNotFoundError: [Errno 2] No such file or directory: './model/sac'
#41 opened 3 years ago by Alexzzdfjcn
3
How do I adjust SAC if my continuous action space extends beyond [-1, 1]?
#33 opened 3 years ago by sarmientoj24
1
Does RDPG run on MDP domains?
#31 opened 3 years ago by hai-h-nguyen
3
Missing folders
#30 opened 3 years ago by hai-h-nguyen
1
Random actions at the beginning for recurrent policy
#23 opened 3 years ago by hai-h-nguyen
1
Does sac_v2_lstm support Pendulum-v0?
#19 opened 3 years ago by zhaoguangyuan123
1
Variable length episodes
#20 opened 3 years ago by alanmackey
1
I think "with torch.no_grad():" is needed when calculating critic loss
#16 opened 4 years ago by dbsxdbsx
2
Stochastic action sampling does not seem right to me
#11 opened 4 years ago by BigWZhu
1
Issue in test mode of 'sac_v2_gru.py'
#14 opened 4 years ago by hynkis
1
Why input the last action to the LSTM policy network?
#7 opened 4 years ago by junhuang-ifast
1
Parameters copying
#2 opened 5 years ago by manuelsh
1