Issues
Is this code SAC-V, not SAC?
#43 opened by night2570 - 2
[Question] Mask Batch
#42 opened by chenxi-yang - 0
Question: Why optimize loss_alpha?
#47 opened by DefinitlyEvil - 0
No normalization of state space
#46 opened by rosa-wolf - 0
Model saving and loading
#41 opened by tissten - 1
Question about q_loss and alpha_loss
#40 opened by xxxkxin - 0
Doubts about regularization in policy loss
#39 opened by Marxvans - 5
Resume training
#35 opened by Tomeu7 - 6
Exploding entropy temperature
#34 opened by reubenwong97 - 1
Inconsistent seeding
#32 opened by mohakbhardwaj - 1
Action scale and action bias
#24 opened by shakenov-chinga - 0
Support OpenAI Gym Robotic Env?
#30 opened by peiseng - 4
Target value calculation mistake
#25 opened by alirezakazemipour - 4
Can I use this in a custom gym env?
#26 opened by kwk2696 - 4
Policy Loss with Minimum or Q1?
#3 opened by pranv - 4
Puzzles about action scaling
#20 opened by wayunderfoot - 11
Question about policy_loss
#14 opened by toshikwa - 3
About model.py line 105
#17 opened by BangLiu - 9
Derivative in reparametrization trick?
#11 opened by ZeratuuLL - 4
Value network
#9 opened by jendelel - 3
Normalized Actions has bugs
#12 opened by Phlogiston90 - 5
Reproducibility for HalfCheetah-v2
#4 opened by tldoan - 1
Why do you need to use NormalizedActions()?
#6 opened by JingJerry - 10
Reparametrization trick issue
#5 opened by tldoan - 3
A question in the deterministic case
#2 opened by roosephu - 1