p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

Question on SAC implementation

fokx opened this issue 3 years ago · 0 comments

fokx commented 3 years ago

In SAC.py, line 120 reads:

Deep-Reinforcement-Learning-Algorithms-with-PyTorch/agents/actor_critic_agents/SAC.py

Line 120 in b338c87

_, z, action = self.produce_action_and_action_info(state)

However, produce_action_and_action_info(state) returns its values in a different order than those variable names suggest:

Deep-Reinforcement-Learning-Algorithms-with-PyTorch/agents/actor_critic_agents/SAC.py

Line 135 in b338c87

return action, log_prob, torch.tanh(mean)

As written, z receives log_prob and action receives torch.tanh(mean), not the sampled action. So, even though the SAC algorithm works in practice, is this a mistake?
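
To make the mismatch concrete, here is a minimal sketch of how positional unpacking binds the three return values. The function below is a hypothetical stand-in written in plain Python, not the repository's code: it only reproduces the return order shown on line 135, and its log_prob is a placeholder value rather than a real SAC log-probability.

```python
import math

def produce_action_and_action_info(mean, noise):
    """Hypothetical stand-in that mirrors the return order on line 135:
    (action, log_prob, tanh(mean))."""
    raw = mean + noise              # stand-in for a sample from the policy
    action = math.tanh(raw)         # squashed sampled action
    log_prob = -0.5 * noise ** 2    # placeholder, NOT a real SAC log-probability
    return action, log_prob, math.tanh(mean)

# Unpacking with the names used on line 120.
_, z, action = produce_action_and_action_info(mean=0.3, noise=1.0)

# Unpacking with names that match the return statement.
sampled, log_prob, tanh_mean = produce_action_and_action_info(mean=0.3, noise=1.0)

# Unpacking is positional, so the names on the left have no effect on
# which value lands where.
z_is_log_prob = (z == log_prob)              # z holds the second value
action_is_tanh_mean = (action == tanh_mean)  # action holds the third value
action_is_sampled = (action == sampled)      # action is not the sampled action

print(z_is_log_prob, action_is_tanh_mean, action_is_sampled)
```

Whether this is a bug depends on how line 120 uses action afterwards, which the snippets above do not show. If the caller wants the deterministic tanh(mean), the behaviour could be intended and only the name z would be misleading.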