Misunderstanding in chapter 2
zZthebreakerZz opened this issue · 1 comment
zZthebreakerZz commented
Hi, everyone. I have read the book, but I still don't understand why we need to add "true_reward" to "self.q_true". The book only says that a normal distribution is used to create q_true. Can someone explain the details to me, please? Thank you!
ShangtongZhang commented
As I remember, q_true is the mean of the Gaussian that each arm's reward is sampled from.
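
To make that concrete, here is a minimal sketch of the idea, not the repository's actual code. It assumes each arm's q_true is a standard normal sample shifted by true_reward, and that each observed reward is then drawn from a unit-variance Gaussian centered on that arm's q_true. The helper names `make_q_true` and `pull` are hypothetical and chosen only for illustration.

```python
import random


def make_q_true(k, true_reward=0.0, rng=random):
    # Assumption: each arm's true value is a standard normal sample
    # shifted by true_reward, i.e. drawn from N(true_reward, 1).
    return [rng.gauss(0.0, 1.0) + true_reward for _ in range(k)]


def pull(q_true, action, rng=random):
    # The observed reward is noisy: one sample from N(q_true[action], 1),
    # so q_true[action] is the mean of the reward distribution.
    return rng.gauss(q_true[action], 1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
q_true = make_q_true(10, true_reward=4.0, rng=rng)

# Averaging many pulls of arm 0 should approach q_true[0].
rewards = [pull(q_true, 0, rng) for _ in range(20000)]
sample_mean = sum(rewards) / len(rewards)
print(q_true[0], sample_mean)
```

Under these assumptions, true_reward only moves the center of the distribution that the q_true values come from. With true_reward left at 0 this is the standard testbed; a nonzero value shifts every arm's mean by the same amount, which seems to be what the book's gradient bandit experiment needs when it uses true values with a mean of +4 instead of zero.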