navneet-nmk/kiso

Notes

theory-with-papers

2018-07

The Mirage of Action-Dependent Baselines in Reinforcement Learning [paper] [notes]

2018-06

Gradient Estimation Using Stochastic Computation Graphs [paper] [notes]
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation [paper] [notes]