Understanding alpha learning

Hi there,
I am confused about how alpha learning is done here:

Line 244 in 222537f

    
           alpha_loss = - (self.log_alpha.cpu() * (log_pis.cpu() + self.target_entropy).detach().cpu()).mean()

I thought line 244 should use alpha instead of self.log_alpha to compute alpha_loss. The dependency would then be self.log_alpha --> alpha --> alpha_loss, so that Adam still optimizes self.log_alpha automatically for us.

Thanks.

Shuang

You are right, thanks for mentioning it! I just updated the code.
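
For anyone reading later, here is a minimal sketch of the dependency chain described in the question, written in plain Python rather than PyTorch so it runs without extra packages. It is not the repository's code: the function names alpha_loss and grad_wrt_log_alpha and the sample numbers are hypothetical, and alpha = exp(log_alpha) is assumed from the name log_alpha. The term (log_pi + target_entropy) is treated as a constant, which mirrors the .detach() call in the quoted line.

```python
import math


def alpha_loss(log_alpha, log_pis, target_entropy):
    """Alpha loss computed from alpha = exp(log_alpha), as the question suggests."""
    alpha = math.exp(log_alpha)
    # (log_pi + target_entropy) is a constant here, mirroring .detach().
    terms = [alpha * (lp + target_entropy) for lp in log_pis]
    return -sum(terms) / len(terms)


def grad_wrt_log_alpha(log_alpha, log_pis, target_entropy):
    """Analytic d(alpha_loss)/d(log_alpha) via the chain rule."""
    alpha = math.exp(log_alpha)
    mean_term = sum(lp + target_entropy for lp in log_pis) / len(log_pis)
    # d(loss)/d(alpha) = -mean_term and d(alpha)/d(log_alpha) = alpha.
    return -mean_term * alpha


# Hypothetical sample values, for illustration only.
log_pis = [-1.2, -0.7, -2.0]
target_entropy = -1.0
log_alpha = 0.0

loss = alpha_loss(log_alpha, log_pis, target_entropy)
grad = grad_wrt_log_alpha(log_alpha, log_pis, target_entropy)
print(loss, grad)
```

The optimized parameter is still log_alpha; the loss simply reaches it through alpha. In PyTorch, autograd would presumably apply the same chain rule when the loss is built from self.log_alpha.exp(), so the optimizer can keep holding self.log_alpha.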