Gradient of the DenseCRF loss
Closed this issue · 4 comments
Hi,
I'm very interested in your work and want to follow your ECCV 2018 paper. I notice that in the paper the DenseCRF loss is

$$\mathcal{L}_{CRF}(S) = \sum_k S^{k\top} W \left(\mathbf{1} - S^k\right),$$

while in the code it is

$$\mathcal{L}_{CRF}(S) = -\sum_k S^{k\top} W S^k.$$

In both the code and the paper, its gradient is computed as

$$\frac{\partial \mathcal{L}_{CRF}}{\partial S^k} = -2\, W S^k.$$

However, I think the gradient should be

$$\frac{\partial \mathcal{L}_{CRF}}{\partial S^k} = W \mathbf{1} - 2\, W S^k.$$

Why is there a difference between the implementation and the theory? Should the first term of the gradient be ignored?
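For concreteness, here is a minimal NumPy sketch of the two loss formulations and the two gradients above. The random `W` and `S` are my own stand-ins for the Gaussian kernel and the softmax output; this is not the repo's implementation:

```python
import numpy as np

np.random.seed(0)
N, K = 6, 3                      # N pixels, K classes

# Symmetric positive affinity matrix, standing in for the Gaussian kernel W
X = np.random.rand(N, N)
W = (X + X.T) / 2

# Soft segmentation S: softmax over classes, so each row sums to 1
logits = np.random.randn(N, K)
S = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

ones = np.ones(N)

# Paper loss: sum_k S_k^T W (1 - S_k)
loss_paper = sum(S[:, k] @ W @ (ones - S[:, k]) for k in range(K))

# Code loss: -sum_k S_k^T W S_k
loss_code = -sum(S[:, k] @ W @ S[:, k] for k in range(K))

# The two losses differ by 1^T W 1, which does not depend on S
print(loss_paper - loss_code, ones @ W @ ones)   # equal

# Gradient used in the code: dL/dS_k = -2 W S_k (drops the W 1 term)
grad_code = -2 * (W @ S)
# Full gradient of the paper loss (W symmetric): W 1 - 2 W S_k
grad_paper = (W @ ones)[:, None] - 2 * (W @ S)
```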
Same doubt. Did you figure it out?
When summing over $k$, the first term becomes a constant: since $\sum_k S^k = \mathbf{1}$, we have $\sum_k S^{k\top} W \mathbf{1} = \mathbf{1}^\top W \mathbf{1}$, no matter whether $S$ is discrete or continuous. So I chose to ignore the first term.
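This constancy argument can be checked numerically: the dropped $W\mathbf{1}$ term is identical for every class, and softmax rows always sum to one, so the two losses yield the same gradient with respect to the logits. A small PyTorch sketch (my own toy example, not code from this repo):

```python
import torch

torch.manual_seed(0)
N, K = 6, 3

X = torch.rand(N, N)
W = (X + X.T) / 2                 # symmetric affinity, stand-in for the Gaussian kernel

logits = torch.randn(N, K, requires_grad=True)
S = torch.softmax(logits, dim=1)  # rows sum to 1
ones = torch.ones(N)

# Paper loss: sum_k S_k^T W (1 - S_k); code loss: -sum_k S_k^T W S_k
loss_paper = sum(S[:, k] @ W @ (ones - S[:, k]) for k in range(K))
loss_code = -sum(S[:, k] @ W @ S[:, k] for k in range(K))

# Gradients w.r.t. the logits coincide: the W·1 term contributes the same
# value to every class, and softmax is invariant to such per-pixel shifts.
g_paper, = torch.autograd.grad(loss_paper, logits, retain_graph=True)
g_code, = torch.autograd.grad(loss_code, logits)
print(torch.allclose(g_paper, g_code, atol=1e-6))   # True
```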
@meng-tang Hi, I'm confused. Since $W$ is generated with a Gaussian kernel, every entry of $W$ is positive. $S^k$ is the softmax output, so $S^k$ is also positive. Then the gradient $-2WS^k$ is always negative, so how can gradient descent work?
The loss just keeps increasing...