Performance gap on the NeRF synthetic dataset
walsvid opened this issue · 12 comments
As mentioned in another issue (#2), training with the default config (similar to nerf-pytorch) does not achieve performance comparable to Instant-NGP.
Hi @walsvid, this is indeed the case, thanks for pointing it out! Although the renders look "good", the PSNR does not match the values reported in the Instant-NGP paper.
Note: You should look at the test-set PSNR (not the training PSNR). I have just pushed some code to print these values. Pull the latest master branch.
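For reference, here is a minimal sketch of how test-set PSNR is typically computed from the mean squared error between rendered and ground-truth images (a generic formula, not necessarily the exact code just pushed):

```python
import torch

def psnr(rendered: torch.Tensor, ground_truth: torch.Tensor) -> float:
    """PSNR in dB for images with pixel values in [0, 1]."""
    mse = torch.mean((rendered - ground_truth) ** 2)
    return float(-10.0 * torch.log10(mse))

# Average psnr(render, gt) over all held-out test views, not over the training views.
```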
Here's an example with the Lego dataset: PSNR on the test set = 28.89. Corresponding renders:
video.mp4
To be honest, I am not sure why this is the case. We can try different values for finest_res (the above result is with 1024). If you figure out the reason, please let me know!
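For context, finest_res sets the resolution of the finest hash-grid level, and the per-level resolutions grow geometrically between the base and finest resolution (Eq. 2-3 of the Instant-NGP paper). A small sketch of that schedule; the parameter names here are illustrative, not necessarily this repo's:

```python
import numpy as np

def level_resolutions(base_res=16, finest_res=1024, n_levels=16):
    """Per-level grid resolutions: N_l = floor(N_min * b**l),
    with growth factor b = exp((ln N_max - ln N_min) / (L - 1))."""
    b = np.exp((np.log(finest_res) - np.log(base_res)) / (n_levels - 1))
    return [int(np.floor(base_res * b ** level)) for level in range(n_levels)]

print(level_resolutions(finest_res=512))   # coarser finest level
print(level_resolutions(finest_res=1024))  # the setting used for the 28.89 result above
```
Changing finest_res only moves the top end of this schedule, so it mainly affects how much high-frequency detail the finest levels can represent.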
A possible reason is that instant-ngp loads all train/val/test data for training (see here) on the nerf_synthetic dataset?
It seems that if we extend the number of iterations over which the TV loss is applied, PSNR could rise (see run_nerf.py, line 876).
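For anyone who wants to experiment with this, here is a rough sketch of a total-variation regularizer on a dense feature grid. This is only an illustration of the idea; the actual TV loss in this repo operates on the hash-table embeddings, so treat the tensor layout and weighting below as assumptions:

```python
import torch

def tv_loss(grid: torch.Tensor) -> torch.Tensor:
    """Total-variation penalty on a dense feature grid of shape (X, Y, Z, C).
    Penalises differences between neighbouring grid features, encouraging smoothness."""
    dx = (grid[1:, :, :, :] - grid[:-1, :, :, :]).pow(2).mean()
    dy = (grid[:, 1:, :, :] - grid[:, :-1, :, :]).pow(2).mean()
    dz = (grid[:, :, 1:, :] - grid[:, :, :-1, :]).pow(2).mean()
    return dx + dy + dz

# total_loss = img_loss + tv_weight * tv_loss(features)  # tv_weight is a hyperparameter
```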
So, can someone provide a benchmark of this repo vs. the official instant-ngp repo? It would be very useful!
Hi @yashbhalgat, may I know what the expected difference in training speed is between this repo, nerf-pytorch, and instant-ngp?
I ran nerf-pytorch and this repo and did not find any significant gain in speed (which is the main purpose of instant-ngp).
Thank you.
@Feynman1999 regarding benchmarks vs. Instant-NGP, I will try to get this ready, but it won't be any time soon as I am currently caught up with other projects. If you or someone else can work on this, feel free to open a pull request. :)
@shreyk25,
- this is a pure PyTorch(+Python) implementation, so it isn't as fast as the CUDA(C++) implementation by Instant-NGP.
- Compared to nerf-pytorch, the iterations/second speed during training is almost the same. However, as mentioned in the README, the HashNeRF algorithm converges much faster (refer to the video below). For the "chair" dataset, HashNeRF converges in about 3000 iterations (which translates to about 15 minutes of training time), while vanilla NeRF (nerf-pytorch) takes a few hours to reach similar performance. I have observed approximately a 20x convergence speedup compared to nerf-pytorch.
Chair.Convergence.mp4
Hope this helps. :)
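If anyone wants to reproduce this comparison on their own machine, here is a hedged sketch for logging both throughput (iterations/second) and convergence against wall-clock time. train_step and eval_test_psnr are placeholders for whatever your training loop and evaluation code expose, not functions in this repo:

```python
import time

def benchmark(train_step, eval_test_psnr, n_iters=3000, log_every=500):
    """Log iterations/second and test-set PSNR against wall-clock time."""
    start = time.perf_counter()
    for it in range(1, n_iters + 1):
        train_step()  # one optimizer step of the model under test
        if it % log_every == 0:
            elapsed = time.perf_counter() - start
            print(f"iter {it}: {it / elapsed:.1f} it/s, "
                  f"test PSNR {eval_test_psnr():.2f} dB after {elapsed / 60:.1f} min")
```
Running the same harness for nerf-pytorch and for this repo gives PSNR-vs-time curves that can be compared directly.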
It seems that if we extend the number of iterations over which the TV loss is applied, PSNR could rise (see run_nerf.py, line 876).
Have you tried this TV loss? How does it affect PSNR?
A possible reason is that instant-ngp loads all train/val/test data for training (see here) on the nerf_synthetic dataset?
The author just clarified to me that they only use the train split to train: kwea123/ngp_pl#1
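For completeness: the nerf_synthetic (Blender) format keeps separate transforms_train.json, transforms_val.json, and transforms_test.json files, so training only on the train split just means reading only that file. A minimal loading sketch, assuming the standard Blender layout:

```python
import json
import os

def load_split(basedir, split="train"):
    """Read one split of the nerf_synthetic dataset: image paths and camera poses."""
    with open(os.path.join(basedir, f"transforms_{split}.json")) as f:
        meta = json.load(f)
    frames = [(os.path.join(basedir, frame["file_path"] + ".png"), frame["transform_matrix"])
              for frame in meta["frames"]]
    return frames, meta["camera_angle_x"]

# Train only on load_split(basedir, "train"); keep the test split for PSNR evaluation.
```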
Thanks for the great work @yashbhalgat. Do you now have any idea about the performance (numerical results) gap between Instant-NGP and this implementation? I guess some essential implementation detail may be missing in this repo, but I am not able to find it...
@zParquet I am wondering the same and have been searching for quite a while now, because the gap is quite large. Two differences I have found so far:
- Here (line 265 at commit 425e70b), the small eps for Adam is only applied to the hash-table entries, yet in the paper it seems they use it also for the MLP (see the optimizer sketch below).
- They mention that the mapping between the grid and the hash table is 1:1 at the lower resolutions. This injectivity is not guaranteed in this implementation, because in theory collisions could also happen at the lower resolutions. However, they are very unlikely, which makes the effect on PSNR questionable.
Further ideas are welcome :D
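To test the first point, one could give the MLP parameters the same small epsilon that the hash-table embeddings get. The Instant-NGP paper reports Adam with beta1=0.9, beta2=0.99, eps=1e-15 and a weak L2 penalty on the MLP weights only. A sketch with stand-in modules (the embedding/MLP shapes below are illustrative, not this repo's exact architecture):

```python
import torch

# Stand-ins for the multi-level hash tables and the small density/color MLP (illustrative shapes).
embeddings = torch.nn.Embedding(2 ** 19, 2)
mlp = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(), torch.nn.Linear(64, 4))

optimizer = torch.optim.Adam(
    [
        {"params": embeddings.parameters(), "eps": 1e-15},                 # as currently done in this repo
        {"params": mlp.parameters(), "eps": 1e-15, "weight_decay": 1e-6},  # paper: same eps, weak L2 on MLP
    ],
    lr=1e-2, betas=(0.9, 0.99),
)
```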
This work seems to reach at least PSNR 34; however, they use their own CUDA version of the encoding:
https://github.com/ashawkey/torch-ngp
After reading appendix E.3, it seems a huge benefit comes from a (very) large number of rays in each batch, at the cost of fewer samples per ray. However, these fewer samples seem to be possible only because of their additional nested occupancy grids (a rough sketch of the trade-off is at the end of this comment).
The batch size has a significant effect on the quality and speed of NeRF convergence. We found that training from a larger number of rays, i.e. incorporating more viewpoint variation into the batch, converged to lower error in fewer steps.
EDIT: The author of the original paper states something similar: NVlabs/instant-ngp#118
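If someone wants to test the "more rays, fewer samples" trade-off here, and assuming the standard nerf-pytorch config names carry over to this repo (N_rand for rays per batch, N_samples/N_importance for coarse/fine samples per ray), here is a sketch of two settings with the same compute budget per batch. The specific numbers are only an illustration, and without occupancy grids the reduced sample count may hurt quality:

```python
# Baseline vs. "more rays, fewer samples" at a constant number of point evaluations per batch.
baseline  = {"N_rand": 1024, "N_samples": 64, "N_importance": 128}
more_rays = {"N_rand": 4096, "N_samples": 32, "N_importance": 16}

for name, cfg in [("baseline", baseline), ("more_rays", more_rays)]:
    total = cfg["N_rand"] * (cfg["N_samples"] + cfg["N_importance"])
    print(f"{name}: {total} point evaluations per batch")  # both print 196608
```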