A question about critic loss in discrete SAC
outshine-J opened this issue · 5 comments
outshine-J commented
I applied the discrete SAC code to a custom environment with discrete actions. During training, the critic loss increased instead of decreasing, and it grew very large, even reaching 200+. What could be causing this, and how can I fix it? Thanks.
fry404006308 commented
Hello, this is Fan Renyi. I have received your email and will handle it as soon as possible. Thank you.
outshine-J commented
To add: the same thing happens even if I clip the reward.
Mengyu-Messic commented
@outshine-J
Hello, I have encountered the same problem. Have you solved it?
outshine-J commented
@Mengyu-Messic
You can find the answer by following this link: toshikwa/sac-discrete.pytorch#12 (comment). Alternatively, you can work around the problem by setting a fixed temperature.
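For anyone unsure what "a fixed temperature" means in practice, here is a minimal sketch in plain Python of the discrete SAC critic target with the temperature held constant. It is not the repository's code: the names `FIXED_ALPHA`, `soft_state_value`, and `critic_target` are hypothetical, and the value 0.2 is only an assumed starting point. The idea, assuming the loss growth comes from the entropy term, is that the target contains `alpha * entropy`, so a temperature that keeps growing inflates the targets, while a constant one keeps that term bounded.

```python
import math

# Assumed constant temperature (hypothetical value; tune for your environment).
FIXED_ALPHA = 0.2


def soft_state_value(probs, q1, q2, alpha=FIXED_ALPHA):
    """Soft value of a state under a discrete policy.

    V(s) = sum_a pi(a|s) * (min(Q1, Q2)(s, a) - alpha * log pi(a|s))
    """
    value = 0.0
    for p, qa, qb in zip(probs, q1, q2):
        if p > 0.0:  # skip zero-probability actions to avoid log(0)
            value += p * (min(qa, qb) - alpha * math.log(p))
    return value


def critic_target(reward, done, next_probs, next_q1, next_q2,
                  gamma=0.99, alpha=FIXED_ALPHA):
    """One-step target for both critics: r + gamma * (1 - done) * V(s')."""
    next_value = soft_state_value(next_probs, next_q1, next_q2, alpha)
    return reward + gamma * (1.0 - float(done)) * next_value


if __name__ == "__main__":
    probs = [0.5, 0.5]
    q1, q2 = [1.0, 2.0], [2.0, 1.0]
    # Same transition, two temperatures: the larger alpha gives the larger target.
    print(critic_target(0.5, False, probs, q1, q2, alpha=0.2))
    print(critic_target(0.5, False, probs, q1, q2, alpha=5.0))
```

In a PyTorch implementation, the equivalent change would usually be to replace the learned temperature with a constant and skip its optimizer update, but the exact variable names depend on the code you are using.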