jaromiru/AI-blog

annealing bias

zacwellmer opened this issue · 2 comments

I could be wrong, but it does not seem that you are annealing the bias with importance sampling as suggested in the PER paper (section 3.4).

w_i = (1/N * 1/P(i))^beta

I think you would have to multiply your gradients by this w_i term.
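To make the suggestion concrete, here is a rough sketch of what I mean. It is not taken from this repo: the function names, the linear schedule, and the starting value `beta_start=0.4` are my own assumptions for illustration. It computes w_i from the formula above, divides by the largest weight in the batch (the paper normalizes by the max weight so that updates are only scaled down), and applies the weights to the squared TD errors, which has the same effect as scaling each sample's gradient by w_i.

```python
def importance_weights(probs, buffer_size, beta):
    """w_i = (1/N * 1/P(i))^beta, normalized by the largest weight in the batch.

    probs: sampling probabilities P(i) of the transitions in the batch.
    buffer_size: N, the number of transitions currently stored.
    """
    raw = [(buffer_size * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


def annealed_beta(step, total_steps, beta_start=0.4):
    """Linearly anneal beta from beta_start to 1 (assumed schedule)."""
    frac = min(1.0, step / total_steps)
    return beta_start + frac * (1.0 - beta_start)


def weighted_loss(td_errors, weights):
    """Mean of w_i * delta_i^2; its gradient is each sample's gradient scaled by w_i."""
    terms = [w * d * d for w, d in zip(weights, td_errors)]
    return sum(terms) / len(terms)


# Hypothetical batch: two transitions from a buffer of 10 items.
beta = annealed_beta(step=0, total_steps=100)
w = importance_weights([0.5, 0.1], buffer_size=10, beta=beta)
loss = weighted_loss([1.0, 2.0], w)
```

With beta = 0 all weights are 1 and this reduces to the unweighted loss; with beta = 1 the non-uniform sampling probabilities are fully compensated.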

Confirmed.

I've got a repo where I implemented importance sampling for DDPG with PER. I'm still unsure whether it works, but lines 68 - 91 might be useful for you. It's pretty slow, and I still haven't figured out a faster way.

It would be interesting to see how not using importance sampling would affect your results.