danaugrs/huskarl

REINFORCE

What about the REINFORCE algorithm?

I'll work on it after Prioritized Experience Replay (PER)! It will probably be a couple of weeks, since I'm taking my time re-reading the PER paper and figuring out the most flexible implementation.

Thank you!