uber-research/learning-to-reweight-examples

gradients of the noisy loss w.r.t. parameter \theta

qingerVT opened this issue · 3 comments

The basic procedure sounds like this (see the sketch after the list):

  • (a) set \epsilon = 0, compute gradients of the weighted noisy loss, and update \theta
  • (b) evaluate the new \theta on the clean validation set
  • (c) set the negative gradients of the validation loss w.r.t. \epsilon as the new \epsilon
  • (d) use the new \epsilon to reweight the noisy data and update the parameters again

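To make the four steps concrete, here is a minimal sketch in PyTorch (a toy linear model; w, alpha, lr, the batch shapes, and the helper per_example_loss are all illustrative assumptions, not the repo's actual code):

```python
import torch

w = torch.randn(5, requires_grad=True)  # model parameters \theta
alpha, lr = 0.1, 0.1                    # lookahead / outer step sizes (arbitrary)

def per_example_loss(w, x, y):
    # per-example squared error; stands in for the noisy training losses C_i
    return (x @ w - y) ** 2

x_noisy, y_noisy = torch.randn(8, 5), torch.randn(8)  # noisy training batch
x_val, y_val = torch.randn(4, 5), torch.randn(4)      # clean validation batch

# (a) set eps = 0 and take a lookahead step; the weighted loss (and its
#     gradient) is numerically zero here, so w_hat == w, but create_graph=True
#     keeps w_hat a differentiable function of eps
eps = torch.zeros(8, requires_grad=True)
weighted_loss = (eps * per_example_loss(w, x_noisy, y_noisy)).sum()
g = torch.autograd.grad(weighted_loss, w, create_graph=True)[0]
w_hat = w - alpha * g

# (b) evaluate the lookahead parameters on the clean validation set
val_loss = per_example_loss(w_hat, x_val, y_val).mean()

# (c) the gradient of the validation loss w.r.t. eps is non-zero even though
#     eps == 0; negate, rectify, and normalize it to get example weights
eps_grad = torch.autograd.grad(val_loss, eps)[0]
weights = torch.clamp(-eps_grad, min=0.0)
weights = weights / weights.sum().clamp(min=1e-8)

# (d) reweight the noisy losses with the new weights and update w for real
reweighted_loss = (weights * per_example_loss(w, x_noisy, y_noisy)).sum()
w_grad = torch.autograd.grad(reweighted_loss, w)[0]
with torch.no_grad():
    w -= lr * w_grad
```

Note that step (a) produces w_hat without modifying w in place; w only changes in step (d).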
My question: when \epsilon = 0, isn't the derivative of the weighted loss w.r.t. \theta also 0? Doesn't that mean we don't actually update \theta in (a)?

Sorry, got it!

@qingerVT So what is the answer? I have also been confused about this recently.
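For later readers, here is my reading of why this works (a sketch of the paper's chain-rule argument, not a reply from the original participants). Write the weighted noisy loss as \ell(\theta, \epsilon) = \sum_i \epsilon_i C_i(\theta) and the lookahead step as \hat\theta(\epsilon) = \theta - \alpha \nabla_\theta \ell(\theta, \epsilon). Then:

```latex
% At \epsilon = 0 the weighted loss and its \theta-gradient both vanish,
% so step (a) leaves \theta numerically unchanged:
\nabla_\theta \ell(\theta, \epsilon)\big|_{\epsilon = 0}
  = \sum_i \epsilon_i \, \nabla_\theta C_i(\theta) \Big|_{\epsilon = 0} = 0,
\qquad \hat\theta(0) = \theta.

% But \hat\theta is linear in \epsilon, so the meta-gradient in step (c)
% is non-zero in general:
\frac{\partial \, \ell_{\mathrm{val}}\big(\hat\theta(\epsilon)\big)}
     {\partial \epsilon_i} \bigg|_{\epsilon = 0}
  = -\alpha \, \nabla_\theta \ell_{\mathrm{val}}(\theta)^{\top}
    \nabla_\theta C_i(\theta).
```

So yes: the gradient in (a) really is zero and \theta is not actually updated there. That step is a virtual update whose only purpose is to make \hat\theta a function of \epsilon, so the validation gradient in (c) can flow back to the example weights; \theta itself is only updated in (d).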