lab4d-org/lab4d

[Bug/issue] Incomplete Pikachu at training time.

Closed this issue · 8 comments

I followed the tutorial RECONSTRUCT A CAT FROM A SINGLE VIDEO and got a reasonable result at test time. The result rendered by "lab4d/render.py" at frameid=120 is:
[image: test-time render at frameid=120]

But when I try to render four orthogonal views at frameid=120 during training, the RGB (and also the mask) is incomplete.
[screenshot: incomplete training-time renders]

It should be like this:
[screenshot: expected result]

I think it is a problem with the mask. Could you please give me some advice?

Hi, could you clarify how these are rendered at training time? Is it through the model_eval() function?

Since render.py and evaluate() call the same rendering function, I would expect them to be consistent.

It is through the model_eval() function.
I think the invisibility is caused by the mask, because the masks of these views are not supervised by ground truth.

Are you able to render the turntable view successfully? (It does not have ground truth either.)

If so, the issue would be related to the way things are wired up in the training loop.

from render.py (drag for slow motion):

rgb.mp4

from dvr_model.evaluate at training time:

bob-mvd-6lr120ep-1rgbanneal-4_rgb_eval

Note: I use dvr_model.fields.set_alpha(0) for a more accurate result. Do I also need to call dvr_model.fields.set_beta_prob(0)?

Or is this a near_far problem?

The over-smoothed shape is likely caused by set_alpha(0). set_alpha() is a part of the coarse-to-fine annealing procedure during training and should not be touched.

Also, during eval() mode rendering, the rendered mask blends pixels with a black background here, so you don't see structures beyond it.
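To illustrate why the black background hides structure, here is a minimal, hypothetical compositing sketch (not lab4d's actual code): wherever the rendered opacity/mask is below 1, the remaining weight is filled with the background color, so regions the mask misses vanish into black.

```python
import numpy as np

def composite(rgb, mask, bg_color):
    """Alpha-composite a rendered image over a solid background.

    rgb:      (H, W, 3) float array, foreground color
    mask:     (H, W, 1) float array in [0, 1], rendered opacity
    bg_color: (3,) float array, background color
    """
    # pixels with mask < 1 get blended toward the background color
    return rgb * mask + bg_color * (1.0 - mask)

rgb = np.ones((4, 4, 3)) * 0.8             # bright foreground everywhere
mask = np.zeros((4, 4, 1))
mask[1:3, 1:3] = 1.0                       # only the center is "seen" by the mask
black = np.zeros(3)

out = composite(rgb, mask, black)
# pixels outside the mask become pure black, even though rgb had content there
```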

More info:

  • set_alpha(1) gives you all frequency coefficients of the positional code, and set_alpha(0) limits it to only the [xyz] component. This is the same as Nerfies.
  • set_beta_prob() is only useful for multi-instance training. It changes the probability of code swapping as in RAC.
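The alpha-windowed positional code described above can be sketched as follows. This is a Nerfies-style coarse-to-fine window with hypothetical names, not lab4d's exact implementation: alpha in [0, 1] smoothly gates the frequency bands, so alpha=0 passes only the raw xyz input and alpha=1 enables every band.

```python
import numpy as np

def windowed_posenc(xyz, alpha, num_freqs=6):
    """Positional encoding with a coarse-to-fine frequency window.

    xyz:   (N, 3) coordinates
    alpha: scalar in [0, 1]; 0 keeps only xyz, 1 enables all bands
    """
    feats = [xyz]  # raw coordinates always pass through unmodified
    for j in range(num_freqs):
        # smooth window weight for band j, ramping 0 -> 1 as alpha grows
        w = 0.5 * (1.0 - np.cos(np.pi * np.clip(alpha * num_freqs - j, 0.0, 1.0)))
        for fn in (np.sin, np.cos):
            feats.append(w * fn((2.0 ** j) * np.pi * xyz))
    return np.concatenate(feats, axis=-1)

xyz = np.array([[0.1, 0.2, 0.3]])
lo = windowed_posenc(xyz, alpha=0.0)   # all bands zeroed; only xyz survives
hi = windowed_posenc(xyz, alpha=1.0)   # all bands fully active
```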

Thanks for your patient reply!
Sorry, there was a mistake: I actually set dvr_model.fields.set_alpha(1), because training resumes from a checkpoint trained for 120 rounds and I want the sharpest rendered images for my own optimization (not the original lab4d loss). If I start with alpha=0, the images come out coarse.
So why is the mask incorrect at training time? Could you give some advice?

It might be a near-far plane issue if this only happens for the novel views.

To solve it, use this to compute the near-far plane on the fly, replacing these lines:

            # recompute the near-far plane from the proxy geometry's bounding box
            corners = trimesh.bounds.corners(self.proxy_geometry.bounds)
            corners = torch.tensor(corners, dtype=torch.float32, device=device)
            field2cam_mat = quaternion_translation_to_se3(field2cam[0], field2cam[1])
            near_far = get_near_far(corners, field2cam_mat, tol_fac=1.5)
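For intuition, here is a hypothetical sketch of what a get_near_far-style helper could compute (the actual lab4d implementation may differ): transform the bounding-box corners into the camera frame, then take the min/max depth padded by a tolerance factor so the object never gets clipped.

```python
import numpy as np

def near_far_from_corners(corners, field2cam, tol_fac=1.5):
    """Estimate near/far planes from bound corners.

    corners:   (8, 3) bounding-box corners in field/world space
    field2cam: (4, 4) SE(3) matrix mapping field coords to camera coords
    tol_fac:   expansion factor around the depth range, for safety margin
    """
    # homogeneous coordinates, then z-depth in the camera frame
    corners_h = np.concatenate([corners, np.ones((len(corners), 1))], axis=-1)
    depth = (corners_h @ field2cam.T)[:, 2]
    # expand the [min, max] depth range about its center by tol_fac
    center = depth.mean()
    near = center - (center - depth.min()) * tol_fac
    far = center + (depth.max() - center) * tol_fac
    return max(near, 1e-3), far  # keep the near plane strictly positive

# unit cube centered at the origin, camera 5 units away along +z
corners = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                   dtype=float)
cam = np.eye(4)
cam[2, 3] = 5.0
near, far = near_far_from_corners(corners, cam)
# depths span [4, 6], so with tol_fac=1.5 this yields near=3.5, far=6.5
```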

It now renders a complete Pikachu! Thank you, gengshan. Have a nice day!