Algomorph/InfiniTAM

Some kind of bug/problem with Killing regularizer

Algomorph opened this issue · 3 comments

The Killing regularizer seems to grow out of proportion with every iteration, whether or not truncated voxels are included in the computation.

Hi,
I wanted to discuss this issue. I have implemented a complete open-source KillingFusion, but had the most problems with the Killing energy.

There are two main points:
a. KillingFusion by itself cannot register two scenes that start out far apart, e.g. when the canonical and the current Snoopy (the toy from Mira Slavcheva's datasets) are too far from each other. Here, the data energy gets no gradients, since the canonical SDF still has the same initial values at those voxels.

b. When working on the Killing energy:
Problem: the data term works fine by itself. When the motion regularizer is turned on, the displacement vectors are much smoother, i.e. all voxels move together. However, balancing the motion regularizer energy against the data energy using weights is extremely difficult.
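For reference, the total non-rigid energy in the paper is a weighted sum of the three terms, so the balancing problem is the choice of ω_k (and ω_s) against the unit-weighted data term. This is written from my memory of the paper, so take the notation as approximate:

```math
E(\Psi) = E_{\text{data}}(\Psi) + \omega_k\,E_{\text{Killing}}(\Psi) + \omega_s\,E_{\text{level-set}}(\Psi)
```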

Possible Cause: the way the gradients are computed, and their behaviour for voxels at the periphery.
The motion regularizer (smoothness constraint) is supposed to make sure that all voxels move together in the same fashion. This means that if a voxel has a zero displacement vector, its neighboring voxels should also stay still.
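If I recall the paper correctly, the regularizer penalizes the deviation of the warp field's Jacobian from antisymmetry (the Killing condition), summed over voxels; the paper also uses a damped variant with a factor γ, which I omit here:

```math
E_{\text{Killing}}(\Psi) = \sum_{v} \left\lVert J_{\Psi}(v) + J_{\Psi}(v)^{\top} \right\rVert_F^2
```

A neighbor pinned at zero displacement next to a moving voxel produces a large Jacobian between them, so the energy actively pulls the moving voxel back.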

The figure below shows an object in green, with blue marking the voxel space covering the object that should be processed. The voxel grid drawn on the right shows voxels in two different colors, green and red: the green voxels lie inside the truncated space and are being processed, while the voxels in the red region are ignored. Here, I will focus on the voxel V at the periphery, i.e. the green one with the solid black border.

When computing the gradient for the motion regularization, I am using a central difference, which reads the values of both the voxel on the left and the voxel on the right. This means that for voxel V, I am using the displacement vector (0, 0, 0) from the voxel on the right, which is not ideal, as no data term energy is ever applied to that voxel.

[figure: green object covered by blue voxel space; close-up grid with processed voxels in green, ignored voxels in red, and the periphery voxel V outlined in black]

Next: if this is a probable cause of the difficulty in balancing the energies, then it is not limited to voxels at the periphery. The effect of the zero displacement of ignored voxels on periphery voxels would, over further iterations, spread to the voxels near the surface adjacent to voxel V, and so on, causing a chain effect.
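To make this concrete, here is a minimal toy sketch (a made-up 1-D warp field; not code from any actual implementation) of how the central difference at the periphery voxel V mixes the ignored neighbor's zero displacement into the regularizer's gradient:

```cpp
// Toy 1-D illustration of the periphery problem described above.
#include <array>
#include <cstdio>

int main() {
    // One displacement component per voxel. Voxels 0..3 are processed
    // (inside the truncation band) and have converged to a uniform motion
    // of 0.8; voxel 4 is ignored and keeps its initial displacement of 0.
    std::array<double, 5> warp      = {0.8, 0.8, 0.8, 0.8, 0.0};
    std::array<bool, 5>   processed = {true, true, true, true, false};

    // The central difference at the periphery voxel V = 3 reads warp[4] == 0,
    // even though the data term never touched that voxel, so the regularizer
    // sees a spurious jump there.
    const int V = 3;
    std::printf("at periphery V: d(warp)/dx = %.2f (right neighbor processed: %d)\n",
                0.5 * (warp[V + 1] - warp[V - 1]), int(processed[V + 1]));

    // An interior voxel, for comparison: both neighbors agree, derivative is 0.
    const int I = 2;
    std::printf("at interior:    d(warp)/dx = %.2f\n",
                0.5 * (warp[I + 1] - warp[I - 1]));
    return 0;
}
```

The non-zero derivative at V is purely an artifact of the ignored neighbor, which the data term never gave a chance to catch up.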

I hope I have presented the issue clearly; in case anything is unclear, please let me know. I would be much obliged if you could share your thoughts on this.

@saurabheights Thanks for reaching out.

b. Yes, sounds very much like what I encountered playing around with the algorithm in the past.
I remember what really did the trick for me at some point was marking all the voxels as "known" or "unknown" (an extra boolean field) and only including the "known" voxels in the computation of any of the terms, i.e. "unknown" voxels would be disregarded when computing their "known" neighbors' motion. This may have certain side effects, like the voxels on the border of the SDF being dragged more slowly (like you said, using central differences in this case essentially eliminates 1/2 of the force that would have come from the "unknown" side).
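Roughly, the idea looks like this (a from-memory sketch with hypothetical names, a single displacement component, and an interior index assumed; not the actual code -- here the "unknown" side is dropped by falling back to a one-sided difference):

```cpp
#include <optional>

struct Voxel {
    float warp;   // one component of the displacement vector, for brevity
    bool  known;  // set only for voxels the data term has actually touched
};

// Derivative estimate of the warp field at interior index i (grid spacing 1).
// "Unknown" neighbors are disregarded; if both neighbors are unknown, the
// voxel contributes nothing to the regularizer.
std::optional<float> WarpDerivative(const Voxel* field, int i) {
    const bool leftKnown  = field[i - 1].known;
    const bool rightKnown = field[i + 1].known;
    if (leftKnown && rightKnown)  // full central difference
        return 0.5f * (field[i + 1].warp - field[i - 1].warp);
    if (rightKnown)               // forward difference: half the stencil,
        return field[i + 1].warp - field[i].warp;  // hence half the "force"
    if (leftKnown)                // backward difference
        return field[i].warp - field[i - 1].warp;
    return std::nullopt;          // isolated voxel: skip it entirely
}
```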

Then I had some back-and-forth emailing with Mira about this and some other issues. She said they don't have any checks whatsoever about known/unknown, but then, in another email, contradicted herself, in a way, saying that certain voxels were "ignored" during computation. I reread all the emails many times, after which I decided "to hell with this" and that I'm going to do my own thing -- whatever I find works.

a. The way they do it, really, is kind-of misleading, in my view. They don't tell you that you can't register two objects when they are far away with the dynamic alignment, a.k.a. the part of Killing- or SobolevFusion optimization that evolves the "live" SDF to the "canonical". What they do is they mask their objects, so that they can use a rigid alignment algorithm (they use their SDF-2-SDF fusion) to register them as closely as possible in an ICP-like fashion first, then they run the non-rigid alignment part.

If you look at their papers and CVPR presentation and check every single example, you'll find that, with the exception of some very simple surfaces, like a ball or a hat, all objects either don't move much relative to each other (gesticulating human figures) or just happen to be the only object in the scene, with the background completely subtracted (Snoopy, Duck, etc.). There are some shots with Snoopy standing on the table and flapping his ears in the KillingFusion paper -- notice that they use the frames where Snoopy doesn't move around for this. All this eliminates the need to match the geometry of the previous frame -- which they don't do: they just evolve the live TSDF all the way to the original canonical TSDF.

That's an easy way to present this as if it works for any dynamic scene, when, in reality, it can't really work even for Snoopy walking around on the table -- the rigid fusion component would match the "future" table to the "original" table, leaving the "future" Snoopy too far away from the "original" Snoopy for the non-rigid alignment to work properly -- the data term would simply have no information to go on.

This is why I'm currently devising a scheme where the "live" TSDF is aligned to the "previous"-frame state rather than to the canonical one.

@Algomorph : Thank you for your quick reply.

a.

> They don't tell you that you can't register two objects when they are far away with the dynamic alignment, a.k.a. the part of Killing- or SobolevFusion optimization that evolves the "live" SDF to the "canonical".
Yes, they could have been more explicit about this issue in *3.2. Rigid Component of the Motion* of the KillingFusion paper. For now, I can just do this using the Procrustes or ICP algorithm.

Regarding multiple motions (table and Snoopy), I mostly agree, but this is research (and IMO a good piece of it, albeit on paper). As of now, I don't see any way for it to perform well on multiple objects, especially since there are no proper correspondences.

> This is why I'm currently devising a scheme where the "live" TSDF is aligned to the "previous"-frame state rather than to the canonical one.
I did want to achieve the rigid registration, since my focus was only to replicate Mira's results. Without rigid registration, I relied on applying the previous frame's deformation field to the current frame and then performing the registration toward the canonical frame.

b. More importantly, thank you. I had a similar discussion with Mira, but she didn't mention anything about ignoring these untouched voxels, so I was in doubt whether this is what I should implement. Your confirmation means a lot.

> This may have certain side effects, like the voxels on the border of the SDF being dragged more slowly

This is still fine. It will lead to slower convergence, but without this the Killing energy will just explode. Furthermore, the loss might not be 1/2, since I can fall back to a forward/backward difference instead of the central one. The only disadvantage is that the central difference would have been a bit less noisy than a forward or backward difference.
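(The accuracy gap is just the standard Taylor-expansion result, so "a bit less noisy" has a concrete meaning: the central difference carries a second-order truncation error, while the one-sided ones are only first-order:)

```math
f'(x) = \frac{f(x+h) - f(x-h)}{2h} + O(h^2), \qquad f'(x) = \frac{f(x+h) - f(x)}{h} + O(h)
```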