p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

A question about the critic loss in discrete SAC

outshine-J opened this issue · 5 comments

I applied the discrete SAC code to a custom discrete-action environment. During training, I found that the critic loss did not decrease but instead increased, reaching very large values (200+). What could be causing this, and how can I fix it? Thanks.

To add: the same thing happens even if I clip the rewards.

@outshine-J
Hello, I have encountered the same problem. Have you solved it?

@Mengyu-Messic
You can find an answer by following this link: toshikwa/sac-discrete.pytorch#12 (comment). Alternatively, you can work around this by setting a fixed temperature.
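For reference, here is a minimal sketch of how a fixed temperature would enter the discrete SAC critic target. This is not the code from this repository; the function name and signature are hypothetical, and it assumes the standard discrete SAC formulation where the next-state value is an exact expectation over the action distribution (no sampling), with `alpha` held constant instead of being learned via entropy tuning:

```python
import torch
import torch.nn.functional as F

def discrete_sac_critic_targets(rewards, dones, next_q1, next_q2, next_logits,
                                gamma=0.99, alpha=0.2):
    """Soft Bellman targets for discrete SAC with a FIXED temperature alpha.

    rewards, dones:   (batch,) tensors.
    next_q1, next_q2: (batch, n_actions) target-critic outputs for next states.
    next_logits:      (batch, n_actions) actor logits for next states.
    """
    probs = F.softmax(next_logits, dim=-1)
    log_probs = F.log_softmax(next_logits, dim=-1)
    # Clipped double-Q: elementwise minimum of the two target critics.
    min_q = torch.min(next_q1, next_q2)
    # In the discrete case the soft value is an exact expectation over
    # actions, including the entropy bonus -alpha * log pi(a|s').
    next_v = (probs * (min_q - alpha * log_probs)).sum(dim=-1)
    return rewards + gamma * (1.0 - dones) * next_v
```

The critic loss would then be the MSE between each critic's Q-value for the taken action and these targets. Keeping `alpha` fixed removes the automatic entropy tuning, which in some reports is the source of the diverging critic loss.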