theory-with-papers

2018-07

  • The Mirage of Action-Dependent Baselines in Reinforcement Learning [paper] [notes]

2018-06

  • Gradient Estimation Using Stochastic Computation Graphs [paper] [notes]
  • Backpropagation through the Void: Optimizing control variates for black-box gradient estimation [paper] [notes]