annealing bias
zacwellmer opened this issue · 2 comments
zacwellmer commented
I could be wrong, but it does not seem that you are annealing the bias with importance sampling as suggested in the PER paper (section 3.4).
w_i = (1/N * 1/P(i))^beta
I think you would have to multiply your gradients by this w_i term.
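To make the suggestion concrete, here is a rough sketch of how the weights could be computed and applied. It is not taken from this repo: the function names, the linear beta schedule, the `beta_start=0.4` default, and normalizing by the largest weight in the batch are all assumptions for illustration. Weighting the per-sample squared TD error by w_i has the same effect as scaling each sample's gradient by w_i.

```python
def importance_weights(probs, buffer_size, beta):
    """Compute w_i = (1/N * 1/P(i))^beta for each sampled transition.

    probs: sampling probabilities P(i) of the transitions in the batch.
    buffer_size: N, the number of transitions in the replay memory.
    The weights are divided by the largest weight in the batch so they
    only ever scale updates down (an assumed normalization choice).
    """
    raw = [(1.0 / (buffer_size * p)) ** beta for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


def annealed_beta(step, total_steps, beta_start=0.4):
    """Linearly anneal beta from beta_start to 1.0 (hypothetical schedule)."""
    frac = min(1.0, step / total_steps)
    return beta_start + frac * (1.0 - beta_start)


def weighted_squared_loss(td_errors, weights):
    """Mean of 0.5 * w_i * delta_i^2, so each gradient is scaled by w_i."""
    total = sum(0.5 * w * d * d for w, d in zip(weights, td_errors))
    return total / len(td_errors)


if __name__ == "__main__":
    beta = annealed_beta(step=5000, total_steps=10000)
    w = importance_weights([0.5, 0.25], buffer_size=4, beta=beta)
    print(beta, w, weighted_squared_loss([1.0, 2.0], w))
```

With beta = 0 every weight is 1, so there is no correction; with beta = 1 the bias from non-uniform sampling is fully compensated.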
jaromiru commented
Confirmed.
zacwellmer commented
I've got a repo where I implemented importance sampling for DDPG with PER. I'm still unsure whether it works, but lines 68 - 91 might be useful for you. It's pretty slow, and I still haven't figured out a faster way.
It would be interesting to see how not using importance sampling would affect your results.