Question on processing the expert data
Closed this issue · 1 comment
Tony-chenjw commented
Hi, I have a question on the code here:
Why do we need this padding? Wouldn't it mess up the order when we select the subset from the expert data?
By the way, I am running cheetah_state.sh.
Could you explain it? Thanks a lot.
zhaohengyin commented
We followed a convention used in other open-source MuZero repositories, which pad the action history at the beginning. This makes the loss computation in the trainer slightly cleaner. We apologize for any confusion. Thanks!
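
For readers with the same question, here is a minimal sketch of why padding at the front need not disturb the order. It is not this repository's code: the helper `pad_actions_front`, the pad length of 1, the pad value of 0, and the toy arrays are all assumptions made for illustration. The idea is that a trajectory with T actions has T+1 observations, so prepending one placeholder action gives both arrays the same length, and index t of the padded array holds the action taken just before observation t.

```python
import numpy as np

def pad_actions_front(actions, pad_len=1, pad_value=0.0):
    """Prepend pad_len placeholder actions (hypothetical helper, not from the repo)."""
    actions = np.asarray(actions, dtype=np.float32)
    pad = np.full((pad_len,) + actions.shape[1:], pad_value, dtype=np.float32)
    return np.concatenate([pad, actions], axis=0)

# Toy trajectory: 4 observations and the 3 actions taken between them.
obs = np.arange(4, dtype=np.float32).reshape(4, 1)
actions = np.array([[10.0], [11.0], [12.0]], dtype=np.float32)

# After padding, actions and observations have the same length.
padded = pad_actions_front(actions)

# The original actions keep their relative order; they are only shifted by one.
same_order = np.array_equal(padded[1:], actions)

# One index array now selects matching rows from both arrays.
idx = np.array([1, 3])
obs_subset = obs[idx]
act_subset = padded[idx]  # the actions that led to obs 1 and obs 3
```

Under these assumptions, selecting a subset uses the same indices for observations and padded actions, and the original ordering is unchanged apart from the constant offset.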