ShangtongZhang/reinforcement-learning-an-introduction

Misunderstanding in chapter 2

zZthebreakerZz opened this issue · 1 comment

Hi, everyone. I have read the book, but I still don't understand why we need to add "true_reward" to "self.q_true". The book only says that q_true is drawn from a normal distribution. Could someone explain this in detail, please? Thank you!

As far as I remember, each entry of q_true is the mean of the Gaussian from which that action's reward is drawn.
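To make that concrete, here is a minimal sketch of how I read it. It is not the repository's code: the class name `SketchBandit`, the `seed` argument, and the use of `np.random.default_rng` are my own assumptions for illustration. The idea is that `true_reward` only shifts the center of the distribution the action values are drawn from (the book's gradient-bandit experiment uses a mean of +4 instead of 0), and each observed reward is then a unit-variance Gaussian around `q_true[action]`.

```python
import numpy as np


class SketchBandit:
    """Hypothetical k-armed bandit, written only to illustrate the question."""

    def __init__(self, k=10, true_reward=0.0, seed=None):
        # `seed` is an assumption of this sketch, added for reproducibility.
        self.rng = np.random.default_rng(seed)
        self.k = k
        self.true_reward = true_reward
        # True action values: standard normal draws shifted by true_reward,
        # so they are centered on true_reward instead of 0.
        self.q_true = self.rng.standard_normal(k) + true_reward

    def step(self, action):
        # Observed reward: Gaussian with mean q_true[action] and variance 1.
        return self.rng.standard_normal() + self.q_true[action]


# Same seed, different offsets: the action values differ only by the shift.
centered = SketchBandit(k=10, true_reward=0.0, seed=0)
shifted = SketchBandit(k=10, true_reward=4.0, seed=0)
offset = shifted.q_true - centered.q_true

# Averaging many rewards for one arm should approach that arm's q_true.
rewards = np.array([shifted.step(3) for _ in range(20000)])
empirical_mean = rewards.mean()
```

With `true_reward=0.0` this reduces to the plain testbed described in the book; a nonzero value just moves every arm's mean by the same constant.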