Questions about gradLM
Closed this issue · 4 comments
Thank you for the impressive work and for sharing the code. I have run it successfully, but I have some questions about the details of gradICP using gradLM.
- I think the residual |Ax - b|^2 defined in `gauss_newton_solve` is a linear approximation to point-to-plane ICP and forms a linear least-squares problem. Since such a problem can be solved directly via the normal equations rather than an LM solver, why add a damping term here? `x` is updated by a smooth gating function in the iteration process. I guess the corresponding code is in icputils.py, line 538:
```python
sigmoid = 1 / ((1 + torch.exp(-B2 * errdiff)) ** (1 / nu))
residual_transform = se3_exp(sigmoid * xi)
src_pc = transform_pointcloud(src_pc[0], residual_transform).unsqueeze(0)
```
The sigmoid function here differs from the paper by the added `** (1 / nu)` exponent. I tried setting `nu = 1` but got a totally wrong result. The errors and sigmoid values across iterations are:
```
Num 0 iteration
current err is 236.284622, new err is 41.130245, sigmoid = 0.000000
Num 1 iteration
current err is 236.284622, new err is 41.130245, sigmoid = 0.000000
```
The sigmoid function seems to reject the better lookahead error.
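For reference, the generalised logistic gate in that snippet can be written as a standalone function (a sketch; the `gate` name and its defaults are illustrative, not gradslam's API):

```python
import torch

def gate(errdiff, B2=1.0, nu=1.0):
    """Generalised logistic gate, as in the snippet above.

    errdiff: difference between the compared residual errors.
    B2, nu: shape parameters; nu = 1 recovers the standard logistic.
    """
    return 1.0 / ((1.0 + torch.exp(-B2 * errdiff)) ** (1.0 / nu))

# With nu = 1 the gate is the plain logistic; a larger nu pushes the
# curve toward 1 for the same errdiff, a smaller nu pushes it toward 0.
d = torch.tensor(2.0)
print(gate(d, nu=1.0))
print(gate(d, nu=0.5))
print(gate(d, nu=2.0))
```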
@MisEty all good questions!
- The reason we use LM (as opposed to solving the original normal equations) is that the approximate Hessian `J^T J` can often turn out to be near-singular. The damping term ensures better numerical conditioning by adding a small positive value to its diagonal elements.
- That is correct. This implementation deviates slightly from the paper in that it uses a more general form of the logistic function (the expression shown at https://en.wikipedia.org/wiki/Generalised_logistic_function). Setting `nu = 1` should have worked, imo. While it is hard to infer the underlying behavior from the snapshot you posted, I think this might have to do with the initial guesses of the sigmoid's parameters (typically we initialize them to be conservative and not accept updates at the beginning, as opposed to being eager and accepting multiple iterates).
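To make the conditioning point concrete, here is a minimal sketch (not gradslam code; the Jacobian and damping value are made up) of how a damping term stabilizes a near-singular normal-equations solve:

```python
import torch

# Illustrative Jacobian whose columns are nearly linearly dependent,
# so J^T J is close to singular (hypothetical numbers).
J = torch.tensor([[1.0, 1.0000001],
                  [2.0, 2.0000002],
                  [3.0, 3.0000001]], dtype=torch.float64)
b = torch.tensor([1.0, 2.0, 3.1], dtype=torch.float64)

JtJ = J.T @ J
Jtb = J.T @ b

lam = 1e-3  # damping factor (an assumed value, for illustration)
damped = JtJ + lam * torch.eye(2, dtype=torch.float64)

# The damped system is far better conditioned than the raw normal
# equations, so the solve stays numerically stable.
x = torch.linalg.solve(damped, Jtb)
print(torch.linalg.cond(JtJ).item(), torch.linalg.cond(damped).item())
```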
Thanks for your answers! But I am still confused about some of the details.
1. I see `A` defined in `gauss_newton_solve` as:
```python
A = torch.cat(
    [nx, ny, nz, nz * sy - ny * sz, nx * sz - nz * sx, ny * sx - nx * sy], 1
)
```
Is it equal to the Hessian matrix of the point-to-plane ICP problem?
2. I think in gradLM, `transform` is updated by:
```python
sigmoid = 1 / ((1 + torch.exp(-B2 * errdiff)) ** (1 / nu))
residual_transform = se3_exp(sigmoid * xi)
transform = torch.mm(residual_transform, transform)
```
I set `nu = 1`, and `B2 = 1` is the default value, so the current sigmoid is the logistic function, a monotonically increasing function. But when `errdiff` decreases, it means that `xi` is a better update, so the sigmoid should be larger to accept the residual transform. I guess this causes the bug I faced. And I notice that setting `B2 = -1` solves the problem:
```
# I set nu=1 and B2=-1, now sigmoid is a decreasing function
Num 0 iteration
current err is 236.284622, new err is 41.130245, sigmoid = 1.000000
Num 1 iteration
current err is 41.130245, new err is 36.040936, sigmoid = 0.993875
Num 2 iteration
current err is 36.045734, new err is 36.218819, sigmoid = 0.456837
Num 3 iteration
current err is 36.169807, new err is 36.319458, sigmoid = 0.462657
```
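The effect in the log above can be reproduced with a toy computation (a sketch; the sign convention for `errdiff` here is an assumption, check icputils.py for the actual definition):

```python
import torch

def gate(errdiff, B2, nu=1.0):
    # Same generalised logistic as in the snippet above.
    return 1.0 / ((1.0 + torch.exp(-B2 * errdiff)) ** (1.0 / nu))

# Assumption: errdiff = new_err - current_err, so a big improvement
# makes errdiff strongly negative.
errdiff = torch.tensor(41.130245 - 236.284622)

print(gate(errdiff, B2=1.0))   # near 0: the improvement is rejected
print(gate(errdiff, B2=-1.0))  # near 1: the improvement is accepted
```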
- `A` is the Jacobian -- the Hessian would in fact be the matrix of second derivatives, but since that's too cumbersome to compute, people approximate the Hessian by `A^T A`.
- This needs more looking into (thanks for posting the log!)
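As a sanity check on the Jacobian claim, one can verify numerically that a row `[n, s x n]` of `A` matches the finite-difference derivative of the point-to-plane residual (a toy sketch with made-up values, using a small-angle motion model rather than the full `se3_exp`):

```python
import torch

# Toy finite-difference check that a row of A = [n, s x n] is the
# derivative of the point-to-plane residual n . (s + t + w x s - d)
# with respect to the twist xi = [t, w] (small-angle motion model).
n = torch.tensor([0.0, 0.0, 1.0], dtype=torch.float64)   # plane normal
s = torch.tensor([0.3, -0.2, 0.5], dtype=torch.float64)  # source point
d = torch.tensor([0.3, -0.2, 0.4], dtype=torch.float64)  # matched point

row = torch.cat([n, torch.linalg.cross(s, n)])  # one row of A

def residual(xi):
    t, w = xi[:3], xi[3:]
    return torch.dot(n, s + t + torch.linalg.cross(w, s) - d)

eps = 1e-6
basis = torch.eye(6, dtype=torch.float64)
fd = torch.stack([
    (residual(eps * basis[i]) - residual(-eps * basis[i])) / (2 * eps)
    for i in range(6)
])
print(torch.allclose(fd, row, atol=1e-6))  # True
```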
Closing due to inactivity