boyu-ai/Hands-on-RL

Formula error in Section 2.4

Opened this issue · 0 comments

https://hrl.boyuai.com/chapter/1/%E5%A4%9A%E8%87%82%E8%80%81%E8%99%8E%E6%9C%BA#24-%CF%B5-%E8%B4%AA%E5%BF%83%E7%AE%97%E6%B3%95


The left-hand side should be the action $a_t$, and the greedy branch should be $\underset{a \in \mathcal{A}}{\operatorname{argmax}} \hat{Q}_t(a)$:

$$ a_{t}= \begin{cases}\operatorname{argmax}_{a \in \mathcal{A}} \hat{Q}_t(a), & \text{with probability } 1-\epsilon \\ \text{a random action from } \mathcal{A}, & \text{with probability } \epsilon\end{cases} $$
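
For reference, the rule above can be sketched in Python as follows. This is only a minimal illustration, not the book's code: the function name `epsilon_greedy_action` and the example estimates are hypothetical. It also assumes that "random" means uniform over $\mathcal{A}$ and that ties in the argmax go to the lowest index.

```python
import random


def epsilon_greedy_action(q_estimates, epsilon, rng):
    """Return an arm index a_t chosen by the epsilon-greedy rule.

    q_estimates: list of current estimates, one per action in A (hypothetical name).
    epsilon:     exploration probability in [0, 1].
    rng:         a random.Random instance, passed in for reproducibility.
    """
    n_actions = len(q_estimates)
    if rng.random() < epsilon:
        # Explore with probability epsilon.
        # Assumption: the random action is drawn uniformly from A.
        return rng.randrange(n_actions)
    # Exploit with probability 1 - epsilon: argmax over a of Q_t(a).
    # Ties are broken in favour of the lowest index.
    return max(range(n_actions), key=lambda a: q_estimates[a])


rng = random.Random(0)
q_hat = [0.2, 0.8, 0.5]  # illustrative estimates, not real data
actions = [epsilon_greedy_action(q_hat, 0.1, rng) for _ in range(10_000)]
greedy_fraction = actions.count(1) / len(actions)
```

Under the uniform assumption, the greedy arm is also selected during exploration, so its overall selection probability is $1-\epsilon+\epsilon/|\mathcal{A}|$ rather than exactly $1-\epsilon$.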

For more details, see my project: https://github.com/StevenJokess/d2rl/blob/master/MAB.md
QQ group for getting in touch: 171097552