idstcv/ZenNAS

Two questions about the computation of zero score

carabnuu opened this issue · 5 comments

Two questions about the computation of zero score

Hi, the work is excellent! I'm curious and have two questions about the computation of the Zen-score.

  1. You compute the Frobenius norm in the original paper, while in the code in 'compute_zen_score.py' the Zen-score is computed by 'torch.abs(output - mixup_output)'.
  2. I don't understand why we should take the differential in this step (step 4 in Algorithm 1). The original paper mentions 'This step replaces the gradient of x with finite differential ∆ to avoid backward-propagation.' Can you give more explanation?
    Thanks a lot!!

Hi carabnuu,
Thanks for the feedback!

  1. The gradient norm can be approximated by a numerical differential, so we can use norm(f(x1)-f(x2)) / norm(x1-x2) to replace the gradient norm.
  2. You can actually use any norm function. In our code we use the L1-norm (abs) because it is faster to compute. Feel free to use the L2-norm as in the paper, or any Lp-norm you like.
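The finite-difference idea in answer 1 can be sketched as follows. This is a minimal illustration, not the repository's exact code: `model` is a hypothetical stand-in network, and the perturbation scale `epsilon` is an assumed value.

```python
# Sketch: approximate the gradient norm with a finite difference,
# using the L1-norm as in the repository's compute_zen_score.py.
# No backward pass is needed.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a candidate architecture.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1),
)

epsilon = 1e-2
x1 = torch.randn(4, 3, 16, 16)            # random Gaussian input
x2 = x1 + epsilon * torch.randn_like(x1)  # small Gaussian perturbation

with torch.no_grad():
    out1 = model(x1)
    out2 = model(x2)

# L1-norm of the finite difference f(x1) - f(x2).
score = torch.abs(out1 - out2).mean()
print(float(score))
```

Swapping `torch.abs(...).mean()` for an L2-norm (`torch.norm(out1 - out2)`) recovers the Frobenius-norm form from the paper.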

Thanks for your answer! That really helps a lot!
But I still have a question about your answer 1 regarding step 4 in Algorithm 1.
This step uses a numerical differential to replace the gradient of the input x. But in the expression in the paper, and in 'torch.abs(output - mixup_output)' in the code as well, I can only see Lp-norm(f(x1)-f(x2)); the denominator part norm(x1-x2) is missing, and I don't know why. I might be stuck on this simple question.
Thanks again!

norm(x1-x2) is nearly a constant because x2 = x1 + epsilon. In high-dimensional space, the norm of a random Gaussian vector is nearly a constant, so dropping the denominator only rescales the score by a fixed factor.
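The concentration claim above is easy to check numerically. A quick sketch (the dimension `d` and sample count are assumed values for illustration):

```python
# Check: in high dimensions, the norm of a random Gaussian vector
# concentrates tightly around a constant (~ sqrt(d) for unit-variance
# entries), so norm(x1 - x2) barely varies across random draws.
import numpy as np

rng = np.random.default_rng(0)
d = 3 * 32 * 32  # e.g. a flattened CIFAR-sized input

# Norms of 1000 independent Gaussian vectors of dimension d.
norms = np.linalg.norm(rng.standard_normal((1000, d)), axis=1)

# The standard deviation is tiny relative to the mean (~ sqrt(d)).
print(norms.mean(), norms.std())
```

With `d = 3072`, the mean is close to `sqrt(3072) ≈ 55.4` while the spread across samples is well under a few percent of that, which is why norm(x1-x2) can be treated as a constant.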

Thank you for your detailed reply !