fudan-zvg/S-NeRF

bad render images on evaluation

szhang963 opened this issue · 15 comments

Hi, I appreciate your work very much. I ran into some issues while reproducing it.
I got a high PSNR (about 46+) during training on scene-0916 of nuScenes. However, when I evaluated the model with eval.py I got a poor PSNR and bad rendered images, and each frame is worse than the previous one (e.g. PSNR of 20.123, 19.45, 17.32, 16.34). Could you give me some suggestions to resolve this?
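For reference, this is how I compute the per-frame PSNR when comparing saved renders against the ground-truth frames (my own sanity-check snippet, not the repo's metric code; it assumes both images are float arrays in [0, 1]):

```python
import numpy as np

def compute_psnr(pred, gt, max_val=1.0):
    # Mean squared error between the rendered and ground-truth frame.
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    # PSNR in dB for images whose values lie in [0, max_val].
    return 10.0 * np.log10((max_val ** 2) / mse)

# e.g. pred = imageio.imread('render_000.png') / 255.0
#      gt   = imageio.imread('gt_000.png') / 255.0
#      print(compute_psnr(pred, gt))
```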
image
Thank you in advance.

@szhang963
I got a similar result.

psnr: 22.334688186645508
Evaluating 2/60
psnr: 21.38553237915039
Evaluating 3/60
psnr: 20.18088722229004
Evaluating 4/60
psnr: 19.331192016601562
Evaluating 5/60
psnr: 18.859142303466797
Evaluating 6/60

And I trained it for 50k epochs.
My result is a little blurred (maybe it needs more training epochs).
image
image

How many epochs did you train?

The model was trained for 85k epochs for the above evaluation. I think there are some hard-coded values and inconsistencies between training and evaluation.

Hey, did you just execute render.py for rendering? Did you have to change the paths anywhere?

I executed 'eval.py' to get the rendered results with the default settings.

Also, intrinsics is set to None in train_depends at line 99 of dataloader.py, but it is later accessed at line 144 of eval.py. Did you make any changes to that?
I used the nuScenes mini dataset for training.

@szhang963 can you check the values and shape of intrinsics in dataloader.py? As I mentioned above, it's set to None and accessed later. I'll fill it with random values if I can at least find out the size of the tensor. Thanks

Thanks for your reply. I also used nuScenes mini scene-0916 for training. Why are the intrinsics set to None for evaluation? Should I set them to the default values from the camera sensor?
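For what it's worth, the per-camera intrinsics can be read directly from the nuScenes calibration records instead of being hard-coded. A sketch using the nuscenes-devkit (the dataroot path and the CAM_FRONT channel are placeholders for whatever the loader actually uses):

```python
import numpy as np
from nuscenes.nuscenes import NuScenes

# Point dataroot at the local nuScenes mini split.
nusc = NuScenes(version='v1.0-mini', dataroot='/path/to/nuscenes', verbose=False)

sample = nusc.sample[0]                                    # any sample in the scene
sd = nusc.get('sample_data', sample['data']['CAM_FRONT'])  # one camera frame
cs = nusc.get('calibrated_sensor', sd['calibrated_sensor_token'])

K = np.array(cs['camera_intrinsic'])  # 3x3 intrinsic matrix for that camera
print(K)
```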


@amoghskanda I fixed it by changing the unpacking to 'images, poses, _, intrinsics, depth_gts, flow, cam_index, skymask, seg_masks, semantic = train_depends'.
Could you share the fixed code or the key modifications needed to get correct rendered results?

@szhang963 I'm not able to render because of an error: 'NoneType' object is not subscriptable at line 352 in model/models.py. chunk_results is a list of tensors, but the last tensor is None, hence the error. Did you not face this issue? Did you change anything at all while training/rendering? I only reduced the batch size and adjusted the learning rate (lrate).
Thanks

@szhang963 I changed the 'render_image' function in models.py.


# Modified render_image in model/models.py. It relies on the file's existing
# imports (torch, torch.nn.functional as F, and the repo's utils module).
def render_image(render_fn, rays, rank, chunk=8192):
    is_semantic = render_fn.func.module.semantic
    n_devices = torch.cuda.device_count()  # devices available for rendering
    height, width = rays[0].shape[:2]
    num_rays = height * width
    # Flatten each ray component from (H, W, ...) to (H * W, ...).
    rays = utils.namedtuple_map(lambda r: r.reshape((num_rays, -1)), rays)
    results = []

    for i in range(0, num_rays, chunk):
        # pylint: disable=cell-var-from-loop
        chunk_rays = utils.namedtuple_map(lambda r: r[i:i + chunk], rays)
        chunk_size = chunk_rays[0].shape[0]
        rays_remaining = chunk_size % n_devices
        if rays_remaining != 0:
            # Pad so the chunk splits evenly across devices. The original JAX
            # code pads with mode='edge'; F.pad has no 'edge' mode, so
            # 'reflect' is used here instead.
            padding = n_devices - rays_remaining
            chunk_rays = utils.namedtuple_map(
                lambda r: F.pad(r, (0, padding, 0, 0), mode='reflect'), chunk_rays)
        else:
            padding = 0
        # After padding, the number of chunk_rays is always divisible by n_devices.
        chunk_results = render_fn(chunk_rays)[-1]
        if not is_semantic:
            # Drop the trailing semantic output (None) to avoid the
            # "'NoneType' object is not subscriptable" error.
            chunk_results = chunk_results[:-1]
        # chunk_results = [res.to('cpu') for res in chunk_results]
        results.append([utils.unshard(x[None, ...], padding)
                        for x in chunk_results])
    if is_semantic:
        rgb, distance, acc, semantic_logits = [
            torch.cat(r, dim=0) for r in zip(*results)]
    else:
        rgb, distance, acc = [
            torch.cat(r, dim=0) for r in zip(*results)]
    rgb = rgb.reshape((height, width, -1))
    distance = distance.reshape((height, width))
    acc = acc.reshape((height, width))
    if is_semantic:
        semantic = semantic_logits.reshape((height, width, -1))
    else:
        semantic = None
    return (rgb, distance, acc, semantic)

And the unpacking was changed so that the intrinsics are passed in correctly:
'images, poses, _, intrinsics, depth_gts, flow, cam_index, skymask, seg_masks, semantic = train_depends'
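After that change, a quick sanity check (my own addition, not part of the repo) confirms that the intrinsics actually reach eval.py instead of staying None:

```python
# Placed right after unpacking train_depends (debugging lines only):
assert intrinsics is not None, "intrinsics is still None -- check the unpacking order"
print('intrinsics:', type(intrinsics), getattr(intrinsics, 'shape', len(intrinsics)))
# Expect one 3x3 matrix per camera.
```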

@ZiYang-xie Hi, could you please spare some time to look into this? I would greatly appreciate your help.

@szhang963 thank you so much. I'm able to render the images now. Did you try experimenting with the renders, like changing the view or the ego vehicle trajectory as mentioned in the paper? And yes, the PSNR is poor for me as well and drops with every evaluated frame.
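What I mean is something like shifting the camera-to-world poses before rendering. A hypothetical sketch (I'm assuming 'poses' is an array of camera-to-world matrices like the ones loaded in dataloader.py):

```python
import numpy as np

def shift_poses(poses, offset=(0.5, 0.0, 0.0)):
    # Translate every camera centre by `offset` (metres) along the world axes,
    # leaving the rotations untouched, to render the scene from shifted views.
    poses = np.array(poses, copy=True)
    poses[:, :3, 3] += np.asarray(offset)
    return poses

# e.g. render the sequence again from poses shifted 0.5 m along the world x-axis
# shifted = shift_poses(poses, offset=(0.5, 0.0, 0.0))
```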

@amoghskanda No, the problem of rendering needs to be solved first.

@ZiYang-xie thank you for this great work. Any insights on szhang's question? How do we improve the rendering results for nuScenes?


Thank you for reaching out with your question. I'll look into this problem shortly. To address the issue you're encountering, I recommend experimenting with a different scene, given the complexity of scene-0916, or trying Waymo instead of nuScenes for the reconstruction. Please consider this approach, and feel free to reach out if you need further guidance or assistance.