[Question] How was the target entropy in the discrete SAC chosen?
aivarsoo opened this issue · 0 comments
aivarsoo commented
Hello! I have a question on the discrete SAC design.
What was the reasoning for choosing the target entropy in the discrete SAC? If I understand correctly, the target entropy represents the ideal entropy of the optimal policy. If so, why is it -0.98 * log(1 / |A|)?
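For reference, the expression simplifies to 0.98 * log|A|, which is 98% of the entropy of a uniform policy over |A| actions (the maximum entropy a discrete policy can have). Below is a minimal sketch of that arithmetic. The function names and the example action count are hypothetical and not taken from any library.

```python
import math


def discrete_target_entropy(num_actions: int, scale: float = 0.98) -> float:
    """Target entropy as written in the question: -scale * log(1 / |A|)."""
    return -scale * math.log(1.0 / num_actions)


def max_policy_entropy(num_actions: int) -> float:
    """Entropy of a uniform policy over |A| actions, i.e. log|A|."""
    return math.log(num_actions)


if __name__ == "__main__":
    n = 4  # hypothetical number of discrete actions
    target = discrete_target_entropy(n)
    upper = max_policy_entropy(n)
    # The target sits just below the maximum achievable entropy.
    print(f"target={target:.4f}, max={upper:.4f}, ratio={target / upper:.2f}")
```

So with this choice the target is positive and sits just below the entropy of a fully random policy, unlike the negative target (-dim(A)) commonly used for continuous SAC.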