meng-tang/rloss

Gradient of the DenseCRF loss


Hi,
I'm very interested in your work and want to follow your ECCV 2018 paper. I notice that in the paper the DenseCRF loss is

$$\mathcal{L}(S) = \sum_k (S^k)^\top W \,(\mathbf{1} - S^k),$$
while in the code it is

$$\mathcal{L}(S) = -\sum_k (S^k)^\top W S^k.$$
In both the code and the paper, the gradient is computed as

$$\frac{\partial \mathcal{L}}{\partial S^k} = -2\,W S^k.$$
However, I think the gradient should be

$$\frac{\partial \mathcal{L}}{\partial S^k} = W(\mathbf{1} - S^k) - W^\top S^k$$
$$= W\mathbf{1} - 2\,W S^k.$$
Why is there a difference between the implementation and the theory? Should the first term of the gradient be ignored?
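
To make the difference concrete, here is a quick toy check (my own sketch; `W` below is just a random symmetric matrix standing in for the Gaussian affinity matrix, and `S` a single class channel):

```python
# Toy check: the gradient of the paper's loss S^T W (1 - S) and the
# gradient of the code's loss -S^T W S differ exactly by the constant
# vector W @ 1, which does not depend on S.
import numpy as np

rng = np.random.default_rng(0)
P = 5                                           # number of pixels
W = rng.random((P, P))
W = 0.5 * (W + W.T)                             # symmetric positive affinities
S = rng.random(P)                               # one class channel S^k

grad_paper = W @ (np.ones(P) - S) - W.T @ S     # = W @ 1 - 2 W @ S
grad_code = -2 * W @ S                          # what the implementation uses

print(np.allclose(grad_paper - grad_code, W @ np.ones(P)))  # True
```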

Same doubt here. Did you figure it out?

When you sum the first term over k, it becomes a constant, no matter whether S is discrete or continuous. So I chose to ignore it.
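
To spell that out: the only fact needed is that softmax outputs sum to one over the classes at every pixel, so the first term of the loss does not depend on S at all:

```latex
% First term of the paper's loss, summed over classes k.
% Since \sum_k S^k = \mathbf{1} for any softmax output S:
\sum_k (S^k)^\top W \mathbf{1}
  = \Big(\sum_k S^k\Big)^{\!\top} W \mathbf{1}
  = \mathbf{1}^\top W \mathbf{1} = \text{const}.
```

So the two losses differ only by this constant. Equivalently, the omitted gradient term $W\mathbf{1}$ is identical for every class $k$, and class-constant directions are cancelled when backpropagating through the softmax, so the logit gradients of the two losses coincide.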

@meng-tang Hi, I'm confused. Since W is generated with a Gaussian kernel, every entry of W is positive, and since S^k is a softmax output, S^k is also positive. Doesn't that make the gradient always negative? How can gradient descent work then?

The loss just keeps increasing…
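
For what it's worth, here is a toy experiment (my own sketch in plain PyTorch, not the repository's permutohedral/CUDA implementation): the gradient with respect to S is indeed elementwise negative, but S is produced by a softmax, and backpropagating through the softmax gives a zero-sum gradient over the classes at each pixel, so gradient descent on the logits can still decrease the loss, which is bounded below by $-\mathbf{1}^\top W\mathbf{1}$:

```python
# Toy sketch (hypothetical setup, not the repo's implementation): gradient
# descent on the logits decreases the loss -sum_k S_k^T W S_k even though
# dL/dS_k = -2 W S_k is elementwise negative, because the softmax backward
# pass produces a zero-sum gradient over the classes at each pixel.
import torch

torch.manual_seed(0)
P, K = 6, 3                                   # pixels, classes
W = torch.rand(P, P)
W = 0.5 * (W + W.T)                           # symmetric positive affinities
logits = (0.1 * torch.randn(P, K)).requires_grad_()

opt = torch.optim.SGD([logits], lr=0.1)
for step in range(501):
    S = torch.softmax(logits, dim=1)          # column S[:, k] is S^k
    loss = -sum(S[:, k] @ W @ S[:, k] for k in range(K))
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:
        print(step, loss.item())              # the loss goes down, not up
```

If the loss keeps increasing in your training, it may be worth double-checking the sign of the gradient in your backward pass (or of the loss itself); in this toy setting the regularizer is bounded and does decrease under plain gradient descent.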