understanding alpha learning
Closed this issue · 1 comment
shuangwu commented
Hi there,
I am confused about how alpha learning is done here:
Soft-Actor-Critic-and-Extensions/SAC.py
Line 244 in 222537f
I thought line 244 should use alpha instead of self.log_alpha to compute alpha_loss. The dependency would then go self.log_alpha --> alpha --> alpha_loss, so that Adam optimizes self.log_alpha automatically through that chain.
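To make the dependency chain concrete, here is a minimal sketch of what I have in mind, assuming PyTorch. The values of `target_entropy` and `log_pis`, the learning rate, and the standalone `log_alpha` and `optimizer` are hypothetical stand-ins for illustration, not the actual code in SAC.py:

```python
import torch

# Hypothetical stand-ins; none of these values come from SAC.py.
log_alpha = torch.zeros(1, requires_grad=True)    # learnable leaf parameter
optimizer = torch.optim.Adam([log_alpha], lr=1e-2)
target_entropy = -2.0                             # assumed target, e.g. -action_dim
log_pis = torch.tensor([[-0.5], [-1.0], [-1.5]])  # assumed log-probs of sampled actions

# log_alpha --> alpha
alpha = log_alpha.exp()

# alpha --> alpha_loss; the entropy term is detached so that
# only log_alpha receives a gradient from this loss.
alpha_loss = -(alpha * (log_pis + target_entropy).detach()).mean()

optimizer.zero_grad()
alpha_loss.backward()          # autograd walks alpha_loss --> alpha --> log_alpha
grad = log_alpha.grad.clone()  # gradient that Adam will use
optimizer.step()               # Adam updates log_alpha itself
new_log_alpha = log_alpha.item()
```

In this toy batch the sampled entropy (mean of `-log_pis`, i.e. 1.0) is above the assumed target of -2.0, so the gradient on `log_alpha` is positive and the Adam step pushes `log_alpha` down.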
Thanks.
Shuang
BY571 commented
You are right, thanks for mentioning it! I just updated the code.