Question about Varifocal loss
HAOCHENYE opened this issue · 7 comments
In the paper, the negative weight of the BCE loss is alpha * p^gamma. However, in varifocal_loss.py, the weight is implemented as:
focal_weight = target * (target > 0.0).float() + \
    alpha * (pred_sigmoid - target).abs().pow(gamma) * \
    (target <= 0.0).float()
Here the negative weight is alpha * (p - q)^gamma. Why?
This is the initial version of the VFL implementation, and I forgot to refine it. The term
alpha * (pred_sigmoid - target).abs().pow(gamma) * (target <= 0.0).float()
is actually equal to
alpha * pred_sigmoid.pow(gamma) * (target == 0.0).float()
because the formula contains the multiplier (target <= 0.0).float() and the target is always >= 0.
Do you mean that
alpha * pred_sigmoid.abs().pow(gamma) * (target <= 0.0).float()
equals
alpha * pred_sigmoid.pow(gamma) * (target == 0.0).float()
or that
alpha * (pred_sigmoid - target).abs().pow(gamma) * (target <= 0.0).float()
equals
alpha * pred_sigmoid.pow(gamma) * (target == 0.0).float()
? I would understand if it were the former.
According to the paper, the negative weight should be
alpha * pred_sigmoid.abs().pow(gamma) * (target <= 0.0).float()
Is the formula in the paper the current version?
Hi, target is the IoU, so it is always >= 0, which implies target <= 0 <=> target == 0.
In this way,
alpha * (pred_sigmoid - target).abs().pow(gamma) * (target <= 0.0).float()
<=>
alpha * (pred_sigmoid - target).abs().pow(gamma) * (target == 0.0).float()
<=>
alpha * pred_sigmoid.abs().pow(gamma) * (target == 0.0).float()
The last step holds because pred_sigmoid - target reduces to pred_sigmoid wherever target == 0, and the mask zeroes out every other position.
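The equivalence above can be checked numerically. The following is a minimal scalar sketch in plain Python, not the repository's tensor code: the function names are hypothetical, and alpha = 0.75 and gamma = 2.0 are illustrative values assumed here.

```python
import math
import random


def neg_weight_impl(p, q, alpha=0.75, gamma=2.0):
    # Form used in the code: alpha * |p - q|^gamma, masked by (q <= 0).
    return alpha * abs(p - q) ** gamma * float(q <= 0.0)


def neg_weight_paper(p, q, alpha=0.75, gamma=2.0):
    # Form from the paper: alpha * p^gamma, applied only where q == 0.
    return alpha * p ** gamma * float(q == 0.0)


random.seed(0)
for _ in range(1000):
    p = random.random()  # stands in for pred_sigmoid, in [0, 1)
    # Assumption: the target q is an IoU, so it is either 0 or positive.
    q = 0.0 if random.random() < 0.5 else random.random()
    assert math.isclose(neg_weight_impl(p, q), neg_weight_paper(p, q))
```

For any q > 0 both forms return 0, and for q == 0 both return alpha * p^gamma, so the two expressions only differ if the target could be negative.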
Ohhh! Thanks, I understand it now.
Hi @hyz-xmaster,
- I did not find the q in the red circle in the code.
- I can't understand the term above the green line. Since log(1-p) is used for negative samples, why does it appear in the q > 0 case? In any case, I did not find the corresponding implementation in the code.
I just understand the code in the following way:
Looking forward to your reply, thanks.
Hi @feiyuhuahuo,
target in the code represents q in that formula. q log(p) + (1-q) log(1-p) is the binary cross entropy loss, which is calculated by F.binary_cross_entropy_with_logits. When q = 0, q log(p) + (1-q) log(1-p) reduces to log(1-p). When q > 0, it remains unchanged.
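To make the two cases in the reply above concrete, here is a minimal scalar sketch in plain Python. It is an assumption-laden illustration rather than the repository's implementation: it takes a probability p in (0, 1) instead of logits, the function names are hypothetical, and alpha = 0.75 and gamma = 2.0 are illustrative values.

```python
import math


def bce(p, q):
    # Binary cross entropy with a soft target q, written with its leading minus sign.
    return -(q * math.log(p) + (1.0 - q) * math.log(1.0 - p))


def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    # Positive samples (q > 0) are weighted by the target q itself;
    # negative samples (q == 0) are weighted by alpha * p^gamma.
    weight = q if q > 0.0 else alpha * p ** gamma
    return weight * bce(p, q)


# q == 0: the BCE term reduces to -log(1 - p).
negative = varifocal_loss(0.2, 0.0)
# q > 0: the full BCE term is kept and scaled by q.
positive = varifocal_loss(0.6, 0.8)
```

A tensor version would typically work on logits for numerical stability, as F.binary_cross_entropy_with_logits does; the scalar form here only shows how the single BCE expression covers both branches.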
Hello, have you added this loss to YOLOv5? Which parts of the code would need to be adjusted?