p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

SAC Discrete needs its own `calculate_entropy_tuning_losses` function?

Harimus opened this issue · 0 comments

While checking the SAC_Discrete code, I noticed that it does not define its own `calculate_entropy_tuning_losses` function and instead inherits it from SAC.

But according to the SAC_Discrete paper, equation 11 vs. equation 9 (the latter is for continuous SAC), in the discrete case the expectation E is estimated by weighting -alpha*(log_pi + target_entropy) by the agent's probability of each action (pi) and summing over the actions, not by sampling a single log_pi.
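To make the difference concrete, here is a minimal sketch of the two estimators in plain Python (no PyTorch, to keep it dependency-free). The function names `sampled_alpha_loss` and `weighted_alpha_loss` are made up for illustration and are not taken from the repo; the sketch also uses alpha directly, as in the text above, which may differ from how the repo parameterises the temperature.

```python
import math
import random


def softmax(logits):
    """Turn raw logits into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sampled_alpha_loss(alpha, probs, target_entropy, rng):
    """Continuous-SAC style estimate: use log_pi of ONE sampled action."""
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    log_pi = math.log(probs[action])
    return -alpha * (log_pi + target_entropy)


def weighted_alpha_loss(alpha, probs, target_entropy):
    """Discrete style estimate: weight each action's term by pi and sum."""
    return sum(
        p * (-alpha * (math.log(p) + target_entropy))
        for p in probs
        if p > 0.0  # skip zero-probability actions to avoid log(0)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    probs = softmax([2.0, 0.0, -1.0])
    alpha, target_entropy = 0.2, 0.5  # arbitrary illustrative values

    exact = weighted_alpha_loss(alpha, probs, target_entropy)
    one_sample = sampled_alpha_loss(alpha, probs, target_entropy, rng)
    print("weighted over all actions:", exact)
    print("single sampled action:    ", one_sample)
```

Both functions estimate the same expectation, but the weighted version computes it exactly for a given state, while the sampled version is a noisy one-sample estimate. In a batched PyTorch setting the weighted form would presumably be a sum over the action dimension followed by a mean over the batch.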

Shouldn't SAC_Discrete have its own entropy loss function then?