Questions about gradLM
Closed this issue · 4 comments
Thank you for the impressive work and for sharing the code. I have run it successfully, but I have some questions about the details of gradICP using gradLM.
- I think the residual |Ax - b|^2 defined in `gauss_newton_solve` is a linear approximation to point-to-plane ICP and forms a linear least-squares problem. Since such a problem can be solved directly via the normal equations rather than an LM solver, why add a damping term here? `x` is updated by a smooth gating function in the iteration process. I guess the corresponding code is in icputils.py, line 538:
```python
sigmoid = 1 / ((1 + torch.exp(-B2 * errdiff)) ** (1 / nu))
residual_transform = se3_exp(sigmoid * xi)
src_pc = transform_pointcloud(src_pc[0], residual_transform).unsqueeze(0)
```
The sigmoid function here differs from the paper by the added `** (1 / nu)` exponent. I tried setting `nu = 1` but got a totally wrong result. The errors and sigmoid values across iterations are:
```
Num 0 iteration
current err is 236.284622, new err is 41.130245, sigmoid = 0.000000
Num 1 iteration
current err is 236.284622, new err is 41.130245, sigmoid = 0.000000
```
The sigmoid function seems to reject the better lookahead error.
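For reference, the generalised logistic gate in that snippet can be written as a standalone function (a sketch; the `gate` name and its defaults are illustrative, not gradslam's API):

```python
import torch

def gate(errdiff, B2=1.0, nu=1.0):
    """Generalised logistic gate, as in the snippet above.

    errdiff: difference between the compared residual errors.
    B2, nu: shape parameters; nu = 1 recovers the standard logistic.
    """
    return 1.0 / ((1.0 + torch.exp(-B2 * errdiff)) ** (1.0 / nu))

# With nu = 1 the gate is the plain logistic; a larger nu pushes the
# curve toward 1 for the same errdiff, a smaller nu pushes it toward 0.
d = torch.tensor(2.0)
print(gate(d, nu=1.0))
print(gate(d, nu=0.5))
print(gate(d, nu=2.0))
```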
@MisEty all good questions!
- The reason we use LM (as opposed to solving the original normal equations) is that the approximate Hessian `J^T J` can often turn out to be near-singular. The damping term ensures better numerical conditioning by adding a small positive value to its diagonal elements.
- That is correct. This implementation deviates slightly from the paper in that it uses a more general form of the logistic function (the expression shown at https://en.wikipedia.org/wiki/Generalised_logistic_function). Setting `nu = 1` should have worked, imo. While it is hard to infer the underlying behavior from the snapshot you posted, I think this might have to do with the initial guesses of the sigmoid's parameters (typically we initialize them to be conservative and not accept updates at the beginning, as opposed to being eager and accepting multiple iterates).
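To make the conditioning point concrete, here is a minimal sketch (not gradslam code; the Jacobian and damping value are made up) of how a damping term stabilizes a near-singular normal-equations solve:

```python
import torch

# Illustrative Jacobian whose columns are nearly linearly dependent,
# so J^T J is close to singular (hypothetical numbers).
J = torch.tensor([[1.0, 1.0000001],
                  [2.0, 2.0000002],
                  [3.0, 3.0000001]], dtype=torch.float64)
b = torch.tensor([1.0, 2.0, 3.1], dtype=torch.float64)

JtJ = J.T @ J
Jtb = J.T @ b

lam = 1e-3  # damping factor (an assumed value, for illustration)
damped = JtJ + lam * torch.eye(2, dtype=torch.float64)

# The damped system is far better conditioned than the raw normal
# equations, so the solve stays numerically stable.
x = torch.linalg.solve(damped, Jtb)
print(torch.linalg.cond(JtJ).item(), torch.linalg.cond(damped).item())
```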
Thanks for your answers! But I am still confused about some of the details.
1. I see `A` defined in `gauss_newton_solve` as:
```python
A = torch.cat(
    [nx, ny, nz, nz * sy - ny * sz, nx * sz - nz * sx, ny * sx - nx * sy], 1
)
```
Is it equal to the Hessian matrix of the point-to-plane ICP problem?
2. I think in gradLM, `transform` is updated by:
```python
sigmoid = 1 / ((1 + torch.exp(-B2 * errdiff)) ** (1 / nu))
residual_transform = se3_exp(sigmoid * xi)
transform = torch.mm(residual_transform, transform)
```
I set `nu = 1`, and `B2 = 1` is the default value, so the current sigmoid is the logistic function, a monotonically increasing function. But when `errdiff` decreases, it means that `xi` is a better update, so the sigmoid should be larger to accept the residual transform. I guess this causes the bug I faced. And I notice that setting `B2 = -1` solves the problem:
```
# I set nu=1 and B2=-1, now sigmoid is a decreasing function
Num 0 iteration
current err is 236.284622, new err is 41.130245, sigmoid = 1.000000
Num 1 iteration
current err is 41.130245, new err is 36.040936, sigmoid = 0.993875
Num 2 iteration
current err is 36.045734, new err is 36.218819, sigmoid = 0.456837
Num 3 iteration
current err is 36.169807, new err is 36.319458, sigmoid = 0.462657
```
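The effect in the log above can be reproduced with a toy computation (a sketch; the sign convention for `errdiff` here is an assumption, check icputils.py for the actual definition):

```python
import torch

def gate(errdiff, B2, nu=1.0):
    # Same generalised logistic as in the snippet above.
    return 1.0 / ((1.0 + torch.exp(-B2 * errdiff)) ** (1.0 / nu))

# Assumption: errdiff = new_err - current_err, so a big improvement
# makes errdiff strongly negative.
errdiff = torch.tensor(41.130245 - 236.284622)

print(gate(errdiff, B2=1.0))   # near 0: the improvement is rejected
print(gate(errdiff, B2=-1.0))  # near 1: the improvement is accepted
```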
- `A` is the Jacobian -- the Hessian would in fact be the matrix of second derivatives, but since that's too cumbersome to compute, people approximate the Hessian by `A^T A`.
- This needs more looking into (thanks for posting the log!)
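As a sanity check on the Jacobian claim, one can verify numerically that a row `[n, s x n]` of `A` matches the finite-difference derivative of the point-to-plane residual (a toy sketch with made-up values, using a small-angle motion model rather than the full `se3_exp`):

```python
import torch

# Toy finite-difference check that a row of A = [n, s x n] is the
# derivative of the point-to-plane residual n . (s + t + w x s - d)
# with respect to the twist xi = [t, w] (small-angle motion model).
n = torch.tensor([0.0, 0.0, 1.0], dtype=torch.float64)   # plane normal
s = torch.tensor([0.3, -0.2, 0.5], dtype=torch.float64)  # source point
d = torch.tensor([0.3, -0.2, 0.4], dtype=torch.float64)  # matched point

row = torch.cat([n, torch.linalg.cross(s, n)])  # one row of A

def residual(xi):
    t, w = xi[:3], xi[3:]
    return torch.dot(n, s + t + torch.linalg.cross(w, s) - d)

eps = 1e-6
basis = torch.eye(6, dtype=torch.float64)
fd = torch.stack([
    (residual(eps * basis[i]) - residual(-eps * basis[i])) / (2 * eps)
    for i in range(6)
])
print(torch.allclose(fd, row, atol=1e-6))  # True
```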
Closing due to inactivity