annealing bias

Question

zacwellmer opened this issue 7 years ago · 2 comments

I could be wrong, but it does not seem that you are annealing the bias with importance sampling, as suggested in the PER paper (section 3.4).

w_i = (1/N * 1/P(i))^beta

I think you would have to multiply your gradients by this w_i term, with beta annealed toward 1 over the course of training.
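
To make the suggestion concrete, here is a minimal NumPy sketch of what I mean. It is not taken from this repo. The function names, the linear schedule, and the starting value of 0.4 for beta are my own assumptions for illustration. Dividing by the largest weight in the batch is a stability trick, so the weights only ever scale updates down.

```python
import numpy as np

def importance_weights(probs, buffer_size, beta):
    """Compute w_i = (1/N * 1/P(i))^beta for a sampled batch.

    probs: sampling probabilities P(i) of the drawn transitions.
    buffer_size: N, the number of transitions currently stored.
    """
    probs = np.asarray(probs, dtype=np.float64)
    weights = (buffer_size * probs) ** (-beta)
    # Assumption: normalize by the batch maximum so every weight is <= 1.
    return weights / weights.max()

def annealed_beta(step, total_steps, beta_start=0.4):
    """Linearly anneal beta from beta_start to 1 (hypothetical schedule)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return beta_start + frac * (1.0 - beta_start)

def weighted_td_loss(td_errors, weights):
    """Mean squared TD error with each sample scaled by its weight.

    Scaling the per-sample loss by w_i scales that sample's gradient by w_i.
    """
    td_errors = np.asarray(td_errors, dtype=np.float64)
    return float(np.mean(weights * td_errors ** 2))
```

In a training loop you would call `annealed_beta(step, total_steps)` each update, pass the result to `importance_weights` together with the probabilities returned by the replay buffer, and use `weighted_td_loss` (or the equivalent in your framework) in place of the unweighted critic loss.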

Answer 1 · 2017-11-10T11:12:38.000Z

Confirmed.

Answer 2 · 2017-11-10T14:03:35.000Z

I've got a repo where I implemented importance sampling for DDPG with PER. I'm still unsure whether it works, but lines 68 - 91 might be useful for you. It's pretty slow, and I still haven't figured out a faster way.

It would be interesting to see how not using importance sampling affects your results.