sweetice/Deep-reinforcement-learning-with-pytorch

Temperature factor missing in SAC !!!

Opened this issue · 1 comment

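log_prob should be multiplied by the temperature factor (alpha) when calculating pi_loss in ALL implementations of SAC. Without alpha, the entropy term is implicitly weighted by 1, no matter which temperature is intended.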
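To make the point concrete, here is a minimal sketch of the policy loss with the temperature applied. It assumes the common reparameterized form of the SAC actor loss, mean(alpha * log_prob - Q). The function name `sac_policy_loss`, the default `alpha=0.2`, and the toy tensors are all made up for illustration and are not taken from the repository.

```python
import torch


def sac_policy_loss(log_prob, q_value, alpha=0.2):
    """Hypothetical helper: SAC actor loss, mean(alpha * log_prob - Q).

    log_prob: log pi(a|s) of actions sampled from the current policy
    q_value:  critic estimate Q(s, a) for those same actions
    alpha:    temperature factor weighting the entropy term (assumed value)
    """
    return (alpha * log_prob - q_value).mean()


# Toy batch with made-up numbers, purely illustrative.
log_prob = torch.tensor([-1.0, -2.0, -0.5])
q_value = torch.tensor([1.0, 0.5, 2.0])

loss_with_alpha = sac_policy_loss(log_prob, q_value, alpha=0.2)
# Dropping alpha is the same as silently fixing it to 1.
loss_without_alpha = sac_policy_loss(log_prob, q_value, alpha=1.0)
```

The two losses differ, so the gradient that reaches the actor differs as well: with alpha left out, the entropy bonus is weighted more heavily than a temperature of 0.2 would imply.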

Also, the output of the "log_std_head" layer in the Actor network in SAC does not need to go through ReLU, because what we need is the LOG of the std, not the std itself. ReLU clips the log to be non-negative, which forces std >= 1.
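A possible shape for the fix, as a rough sketch rather than the repository's actual code: the `log_std_head` output is left unactivated and only clamped to a range. The layer sizes, the class layout, and the bounds `LOG_STD_MIN` / `LOG_STD_MAX` are assumptions (clamping like this is a common choice in SAC code, but the exact values here are illustrative).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed clamp bounds for log_std; not taken from the repository.
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0


class Actor(nn.Module):
    """Illustrative Gaussian policy network with an unactivated log_std head."""

    def __init__(self, state_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.mu_head = nn.Linear(hidden_dim, action_dim)
        self.log_std_head = nn.Linear(hidden_dim, action_dim)

    def forward(self, state):
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        mu = self.mu_head(x)
        # No ReLU on this head: log_std must be allowed to go negative
        # so that std = exp(log_std) can be smaller than 1.
        log_std = torch.clamp(self.log_std_head(x), LOG_STD_MIN, LOG_STD_MAX)
        return mu, log_std


actor = Actor(state_dim=3, action_dim=1)
mu, log_std = actor(torch.zeros(4, 3))
std = log_std.exp()  # always positive, and free to be below 1
```

The clamp keeps std within a numerically safe range without imposing the std >= 1 floor that ReLU would.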