sarafridov/K-Planes

Question about results on the DyNeRF dataset: a difference between the results in the paper and my results

Opened this issue · 6 comments

Hi,

Thank you for sharing your work and the interesting model you've developed.
I hope that your model becomes the new standard for Dynamic NeRF models.

I trained the K-planes hybrid version using the DyNeRF dataset, following your instructions without making any other changes to the configuration. Here's what I did:

  1. Precomputed the IST & ISG weights by setting downsample=4 and global_step=1.
  2. Switched back to downsample=2, set global_step=90001, and trained the model.

For training, I used a single A100 (32 GB) and torch==1.10.1.

Then I evaluated the models with --validate-only and --log-dir /path/to/dir_of_model.
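
For reference, a minimal sketch of how the two phases and the evaluation map to commands; the entry-point script and config path are my assumptions about the repo layout (only the --validate-only and --log-dir flags come from the steps above), so adjust them to your checkout.

    import subprocess

    # Phase 1 (weight precomputation) and phase 2 (full training) both go through the
    # same entry point; only downsample/global_step change in the config between runs.
    # NOTE: "plenoxels/main.py" and the config path are assumptions about the repo layout.
    subprocess.run([
        "python", "plenoxels/main.py",
        "--config-path", "plenoxels/configs/final/dynerf/dynerf_hybrid.py",
    ], check=True)

    # Evaluation only, pointing at the trained model's log directory.
    subprocess.run([
        "python", "plenoxels/main.py",
        "--config-path", "plenoxels/configs/final/dynerf/dynerf_hybrid.py",
        "--validate-only",
        "--log-dir", "/path/to/dir_of_model",
    ], check=True)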

I compared my results with those reported in the supplementary material; my results are marked in red.

[image: per-scene metrics comparison, with my results marked in red]

There seems to be a mismatch in the metrics, and my performance appears to be worse than MixVoxels-L.
My question is: how can I achieve the same results as described in the paper?

Best regards.

Hmm, I'm not sure what could be going on here, since I am able to reproduce the numbers in the paper (at least I checked for the salmon scene), but some things you can try are:

  1. In case there's any issue with the preprocessing for the ray-sampling weights, you can try using some that I uploaded for a few of the scenes (the .pt files here for salmon, and in flamesteak_explicit and searsteak_explicit). If you download these and put them in your data directory, then you can go straight to the full training with the default config and it should work.
  2. I doubt this would be the issue, but you can double-check that your Python environment matches mine in case this is due to some package mismatch; I uploaded my environment package list with the version info here. Note that this includes some unnecessary packages along with the ones that are actually used (sorry about that, we might upload a cleaner version later).
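
In case it helps with debugging, here is a small sketch of both sanity checks; the data path and .pt file names are placeholders (use whatever the downloaded files are actually called), and the version check just prints locally installed versions to compare against the uploaded environment list.

    import torch
    from importlib.metadata import version

    # (1) Load the downloaded ray-sampling weights and confirm they look like per-ray
    # weight tensors. The path and file names here are placeholders/assumptions.
    for name in ["isg_weights.pt", "ist_weights.pt"]:
        weights = torch.load(f"/path/to/data/salmon/{name}", map_location="cpu")
        print(name, type(weights), getattr(weights, "shape", None))

    # (2) Compare a few key package versions against the uploaded environment list.
    for pkg in ["torch", "numpy", "torchvision"]:
        print(pkg, version(pkg))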


Could you upload the pretrained hybrid models for the full DyNeRF scenes? Thanks a lot!


May I ask whether you managed to get the correct results later? My results are even a little worse than yours, especially on the SSIM metric.

Sorry, I tried 2-3 more times, but still couldn't reproduce the paper's results.

I used your pre-trained model to compute metrics, but they also differ from the paper's metrics (test data downsampled ×2), especially SSIM. Is there any difference in how the SSIM metric is calculated?

@minsu1206 The SSIM values reported in the paper are close to the values given as MS-SSIM in the pre-trained results shared by the authors, so they must have used the same metric when comparing against all the other models as well.
@sarafridov, is this the case?
By the way, MS-SSIM should be a more reliable metric for videos.
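
One way to check this is to compute both plain SSIM and MS-SSIM on a saved test render against its ground-truth frame and see which one lines up with the reported numbers. A minimal sketch using torchmetrics (the image paths are placeholders, and torchmetrics is just one convenient implementation, not necessarily what the paper used):

    import torch
    import imageio.v2 as imageio
    from torchmetrics.functional import (
        structural_similarity_index_measure,
        multiscale_structural_similarity_index_measure,
    )

    def load_image(path):
        # HWC uint8 -> 1xCxHxW float in [0, 1]
        img = torch.from_numpy(imageio.imread(path)).float() / 255.0
        return img.permute(2, 0, 1).unsqueeze(0)

    pred = load_image("/path/to/rendered_frame.png")        # placeholder paths
    target = load_image("/path/to/ground_truth_frame.png")

    ssim = structural_similarity_index_measure(pred, target, data_range=1.0)
    ms_ssim = multiscale_structural_similarity_index_measure(pred, target, data_range=1.0)
    print(f"SSIM: {ssim.item():.4f}  MS-SSIM: {ms_ssim.item():.4f}")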