theory-with-papers

2018-07
The Mirage of Action-Dependent Baselines in Reinforcement Learning [paper] [notes]

2018-06
Gradient Estimation Using Stochastic Computation Graphs [paper] [notes]
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation [paper] [notes]