Wrong gradient flow in bias correction term of ACER?
wwiiiii opened this issue · 1 comments
wwiiiii commented
Line 104 in 46f9b32
According to the original paper, the gradient for the bias correction term is defined as an expectation over actions drawn from pi. Since pi serves only as the probability for computing that expectation, it does not seem to be a target of optimization.
Shouldn't we detach pi from the computational graph at the line above?
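To make the concern concrete: as I read the ACER paper, the bias correction part of the gradient has the form

$$\mathbb{E}_{a \sim \pi}\left[\left[\frac{\rho_t(a) - c}{\rho_t(a)}\right]_+ \nabla_\theta \log \pi_\theta(a \mid x_t)\,\bigl(Q_{\theta_v}(x_t, a) - V_{\theta_v}(x_t)\bigr)\right]$$

so the outer pi is only a sampling weight and the gradient should flow through log pi alone. Below is a minimal PyTorch sketch of that idea, not the repository's actual code. The toy values (`logits`, `q`, `v`, `rho`, `c`) and the helper `bias_correction_loss` are hypothetical and chosen just for illustration.

```python
import torch

# Hypothetical toy setup: a single state with three actions.
logits = torch.tensor([0.2, -0.1, 0.4], requires_grad=True)  # stand-in policy output
q = torch.tensor([1.0, 0.5, -0.2])    # stand-in Q(x, a), treated as a constant
v = torch.tensor(0.3)                 # stand-in V(x), treated as a constant
rho = torch.tensor([0.5, 12.0, 20.0]) # stand-in importance ratios, treated as constants
c = 10.0                              # truncation threshold

# [1 - c / rho]_+ from the bias correction term.
coef = (1 - c / rho).clamp(min=0)


def bias_correction_loss(detach_pi: bool) -> torch.Tensor:
    """Surrogate loss whose gradient is meant to match the bias correction term."""
    pi = torch.softmax(logits, dim=0)
    # The expectation weight: detached means it is a pure probability weight.
    weight = pi.detach() if detach_pi else pi
    return -(weight * coef * torch.log(pi) * (q - v)).sum()


(grad_detached,) = torch.autograd.grad(bias_correction_loss(True), logits)
(grad_attached,) = torch.autograd.grad(bias_correction_loss(False), logits)

# Analytic target: -sum_a pi(a) * k(a) * grad log pi(a), with k = coef * (Q - V).
# For a softmax policy, d log pi(a) / d logit_j = 1[a == j] - pi(j).
with torch.no_grad():
    pi = torch.softmax(logits, dim=0)
    k = coef * (q - v)
    expected = -(pi * k - pi * (pi * k).sum())

print("detached:", grad_detached)
print("attached:", grad_attached)
print("expected:", expected)
```

With the weight detached, autograd should reproduce the analytic expectation. Without the detach, the product rule adds an extra term from differentiating the weight itself, which is not part of the formula above.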
seungeunrho commented
Wow, you're correct.
Thanks for such a sharp comment.
I updated the code.