bennyguo/instant-nsr-pl

[Discussion] Make `eps` in the finite difference exponentially decreasing

GCChen97 opened this issue · 34 comments

I just noticed that the finite difference in geometry.py is similar to the numerical gradient computation of Neuralangelo (CVPR 2023).
Maybe the exponentially decreasing eps strategy in Neuralangelo can be adopted, since eps is constant in this codebase : )
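
For reference, a minimal sketch of what such a schedule could look like, assuming (as in Neuralangelo) that eps tracks the cell size of the finest currently-active hash-grid level and therefore shrinks exponentially as finer levels are activated. The numbers and the coarse-to-fine schedule below are illustrative, not the repo's actual defaults:

    import numpy as np

    def progressive_eps(step, base_res=32, max_res=2048, n_levels=16,
                        start_level=4, start_step=20000, update_steps=5000):
        # Growth factor between consecutive hash-grid resolutions.
        growth = np.exp(np.log(max_res / base_res) / (n_levels - 1))
        # Coarse-to-fine: one extra level is activated every `update_steps` steps.
        active = min(n_levels, start_level + max(0, step - start_step) // update_steps)
        finest_res = base_res * growth ** (active - 1)
        # eps equals the cell size of the finest active level (scene assumed
        # normalized to a unit cube), so it decays exponentially over training.
        return 1.0 / finest_res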

I also noticed this paper. Seems like an easy adaptation! I'll integrate this feature soon after some experiments.

@GCChen97 @xiaohulihutu @alvaro-budria
I've pushed an implementation of Neuralangelo. I haven't got the curvature loss to work so feel free to play with it and make it better :)

See here for details.

Thx! I will check the new code and try to come up with something for the curvature loss.

Great! In my experiments, the curvature loss (currently commented out in the config file) works on scene63 but will "blow things up" on scene24 after 15000 iterations. So if you would like to tune the curvature loss, scene24 would be a good start :)

Hi, I am unsure about the current computation of the Laplacian:
laplace = (points_d_sdf[..., 0::2] + points_d_sdf[..., 1::2] - 2 * sdf[..., None]) / (eps ** 2)

It assigns a weight of 2 to the center sample, instead of 6 (the number of "neighbors"). Wikipedia shows the following formula:
[Screenshot: Wikipedia's parameterized 3D discrete Laplacian stencil]
where setting $\gamma_1 = \gamma_2 = 0$ results in the center being weighted by 6.
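
For reference, with $\gamma_1 = \gamma_2 = 0$ that formula reduces to the standard 7-point stencil, where each of the six axis neighbors gets weight 1 and the center gets weight 6:

$$\nabla^{2}f(\mathbf{p}) \approx \frac{1}{\epsilon^{2}}\left(\sum_{i\in\{x,y,z\}}\big(f(\mathbf{p}+\epsilon\mathbf{e}_i)+f(\mathbf{p}-\epsilon\mathbf{e}_i)\big)-6f(\mathbf{p})\right)$$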

I think you're right. In the current code, laplace is:
$$\left(\frac{\partial^2 f}{\partial x^2},\frac{\partial^2 f}{\partial y^2} ,\frac{\partial^2 f}{\partial z^2} \right)$$
And the curvature loss is:
$$\mathcal{L}=\frac{1}{N}\sum\left(\left|\frac{\partial^2 f}{\partial x^2}\right|+\left|\frac{\partial^2 f}{\partial y^2}\right|+\left|\frac{\partial^2 f}{\partial z^2}\right|\right)$$

However, the Laplacian should be:
$$\nabla^{2}f = \frac{\partial^2 f}{\partial x^2}+\frac{\partial^2 f}{\partial y^2} +\frac{\partial^2 f}{\partial z^2}$$
and the curvature loss should be (denoted $\mathcal{L}^{'}$ below):
$$\mathcal{L}^{'}=\frac{1}{N}\sum\left|\nabla^{2}f\right|$$

$\mathcal{L}$ and $\mathcal{L}^{'}$ are clearly not the same.

I think simply changing

laplace = (points_d_sdf[..., 0::2] + points_d_sdf[..., 1::2] - 2 * sdf[..., None]) / (eps ** 2)

with

laplace = (points_d_sdf[..., 0::2] + points_d_sdf[..., 1::2] - 2 * sdf[..., None]).sum(-1) / (eps ** 2)

should be correct?
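
For context, a minimal sketch of how the finite-difference samples and the corrected Laplacian would fit together, assuming points_d stacks the six offset points in the order [+x, -x, +y, -y, +z, -z] (which is what the 0::2 / 1::2 indexing implies); sdf_fn here is a stand-in for the SDF network:

    import torch

    def sdf_laplacian(sdf_fn, points, eps):
        # points: (N, 3); sdf_fn maps (M, 3) -> (M,)
        # Six offset points per query point, ordered [+x, -x, +y, -y, +z, -z].
        offsets = torch.tensor([[ eps, 0., 0.], [-eps, 0., 0.],
                                [0.,  eps, 0.], [0., -eps, 0.],
                                [0., 0.,  eps], [0., 0., -eps]],
                               device=points.device, dtype=points.dtype)
        points_d = (points[:, None, :] + offsets).reshape(-1, 3)   # (N*6, 3)
        points_d_sdf = sdf_fn(points_d).reshape(-1, 6)             # (N, 6)
        sdf = sdf_fn(points)                                       # (N,)
        # Per-axis second derivative: (f(p + eps*e_i) + f(p - eps*e_i) - 2 f(p)) / eps^2,
        # summed over x, y, z to give the Laplacian.
        laplace = (points_d_sdf[:, 0::2] + points_d_sdf[:, 1::2]
                   - 2 * sdf[:, None]).sum(-1) / (eps ** 2)
        return laplace

The curvature loss would then be laplace.abs().mean() over the sampled points, matching $\mathcal{L}^{'}$ above.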

Yes, that seems correct. The code was taking the absolute value of each partial derivative, instead of summing them and then taking the absolute value.

Are you interested in verifying the correctness of this new version? i.e., does it consistently bring quality improvements? If it does, I'd appreciate it if you open a PR.

Sure, I can check if this new curvature penalty improves the results.

The linear warmup for the curvature loss weight is not implemented, and in the paper they mention that this can be important for scenes like dtu24 where there are concave shapes. I was thinking of adding an update to this lambda in an on_train_batch_end callback, but I am unsure whether this is the right approach with PyTorch Lightning.

There are a few hyperparameters whose values currently differ from those in Neuralangelo. In the paper, the hash feature dim is 8 and the maximum number of hash entries per resolution is $2^{22}$; in the configs, the values are 2 and $2^{19}$. I will keep the values in the config as they are to make a fair comparison.

The warm-up is implemented -- you can linearly increase or decrease a loss weight by assigning a tuple of four numbers, [start_step, start_value, end_value, end_step]; the weight then changes linearly from start_value to end_value between start_step and end_step. I listed some differences from the original paper here, and I think it would be fine to keep them unchanged for fair comparisons.
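
A minimal sketch of the linear interpolation this four-number format implies (scheduled_weight is a hypothetical helper, not the repo's exact implementation):

    def scheduled_weight(spec, step):
        # spec is either a constant or [start_step, start_value, end_value, end_step].
        if isinstance(spec, (int, float)):
            return float(spec)
        start_step, start_value, end_value, end_step = spec
        if step <= start_step:
            return start_value
        if step >= end_step:
            return end_value
        t = (step - start_step) / (end_step - start_step)
        return start_value + t * (end_value - start_value)

    # e.g. warm the curvature weight up from 0 to 5e-4 over the first 5000 steps:
    # lambda_curvature = scheduled_weight([0, 0.0, 5e-4, 5000], global_step)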

I tried the corrected Laplacian computation with a subset of DTU scan scenes. I could not observe any improvement in terms of PSNR.

| PSNR | 24 | 37 | 40 | 55 | 63 | 65 |
| --- | --- | --- | --- | --- | --- | --- |
| $\lambda_{curv}=0$ | 31.74 | 26.75 | 30.83 | 31.19 | 35.42 | 32.54 |
| $\lambda_{curv}=0.1$ | 24.05 | - | - | - | - | - |
| $\lambda_{curv}=10^{-3}$ | 24.06 | - | - | - | 31.67 | - |
| $\lambda_{curv}=5\cdot10^{-5}$ | 31.60 | 26.81 | 30.75 | 30.83 | 34.47 | 32.60 |

** For the last two rows ($\lambda_{curv}=10^{-3}$ and $5\cdot10^{-5}$), I added a 5000-iteration warmup on the weight, as the geometry was otherwise getting stuck in the low-curvature sphere initialization.

Looking at the generated images, the surface does seem smoother in the case of DTU24, on the roof (which is actually not desirable):
Screenshot from 2023-06-04 21-18-40

but in DTU63, there is no significant change. No curvature loss:
Screenshot from 2023-06-04 21-23-09

With $\lambda_{curv} = 5 \cdot 10^{-5}$:
Screenshot from 2023-06-04 21-19-43

I suspect that training longer at each level of detail (5000 iterations per level in the paper vs. 1000 here) is an important factor, as it allows a higher weight on the curvature penalty and gives the network more time to adapt at each level, but I have not verified this.

Thank you! I also think the number of training steps is crucial. Considering this should be the correct implementation, could you open a PR? I'll do some experiments on the new code.

I tried the corrected Laplacian computation with a subset of DTU scan scenes. I could not observe any improvement in terms of PSNR. [...]

Hi, how did you add the warmup?

I changed some parameters and decayed the curvature loss, and got PSNR = 32.0 on DTU 24:
[image]

@flow-specter Looks good! Could you share the configuration file?

@flow-specter Looks good! Could you share the configuration file?

Sure~
Here is the configuration file:

name: neuralangelo-dtu-wmask-scan24
tag: decayCurvature
seed: 42
dataset:
  name: dtu
  root_dir: /data/DTU/scan24
  cameras_file: cameras.npz
  img_downscale: 2
  n_test_traj_steps: 60
  apply_mask: true
model:
  name: neus
  radius: 1.0
  num_samples_per_ray: 1024
  train_num_rays: 256
  max_train_num_rays: 8192
  grid_prune: false
  grid_prune_occ_thre: 0.001
  dynamic_ray_sampling: true
  batch_image_sampling: true
  randomized: true
  ray_chunk: 2048
  cos_anneal_end: 500000
  learned_background: false
  background_color: white
  variance:
    init_val: 0.3
    modulate: false
  geometry:
    name: volume-sdf
    radius: 1.0
    feature_dim: 13
    grad_type: finite_difference
    finite_difference_eps: progressive
    isosurface:
      method: mc
      resolution: 512
      chunk: 2097152
      threshold: 0.0
    xyz_encoding_config:
      otype: ProgressiveBandHashGrid
      n_levels: 16
      n_features_per_level: 8
      log2_hashmap_size: 22
      base_resolution: 32
      per_level_scale: 1.3195079107728942
      include_xyz: true
      start_level: 4
      start_step: 20000
      update_steps: 5000
    mlp_network_config:
      otype: VanillaMLP
      activation: ReLU
      output_activation: none
      n_neurons: 64
      n_hidden_layers: 1
      sphere_init: true
      sphere_init_radius: 0.5
      weight_norm: true
  texture:
    name: volume-radiance
    input_feature_dim: 16
    dir_encoding_config:
      otype: SphericalHarmonics
      degree: 4
    mlp_network_config:
      otype: VanillaMLP
      activation: ReLU
      output_activation: none
      n_neurons: 64
      n_hidden_layers: 2
    color_activation: sigmoid
system:
  name: neus-system
  loss:
    lambda_rgb_mse: 0.0
    lambda_rgb_l1: 1.0
    lambda_mask: 0.1
    lambda_eikonal: 0.1
    lambda_curvature:
    - 5000
    - 0.0005
    - 0.0
    - 500000
    lambda_sparsity: 0.0
    lambda_distortion: 0.0
    lambda_distortion_bg: 0.0
    lambda_opaque: 0.0
    sparsity_scale: 1.0
  optimizer:
    name: AdamW
    args:
      lr: 0.01
      betas:
      - 0.9
      - 0.99
      eps: 1.0e-15
    params:
      geometry:
        lr: 0.01
      texture:
        lr: 0.01
      variance:
        lr: 0.001
  constant_steps: 5000
  scheduler:
    name: SequentialLR
    interval: step
    milestones:
    - 5000
    schedulers:
    - name: ConstantLR
      args:
        factor: 1.0
        total_iters: 5000
    - name: ExponentialLR
      args:
        gamma: 0.9999953483237626
checkpoint:
  save_top_k: -1
  every_n_train_steps: 500000
export:
  chunk_size: 2097152
  export_vertex_color: true
trainer:
  max_steps: 500000
  log_every_n_steps: 100
  num_sanity_val_steps: 0
  val_check_interval: 5000
  limit_train_batches: 1.0
  limit_val_batches: 2
  enable_progress_bar: true
  precision: 32
cmd_args:
  config: configs/neuralangelo-dtu/neuralangelo-dtu-wmask.yaml
  gpu: '0'
  resume: null
  resume_weights_only: false
  train: true
  validate: false
  test: false
  predict: false
  exp_dir: ./exp
  runs_dir: ./runs
  verbose: false
trial_name: decayCurvature@20230607-154540
exp_dir: ./exp/neuralangelo-dtu-wmask-scan24
save_dir: ./exp/neuralangelo-dtu-wmask-scan24/decayCurvature@20230607-154540/save
ckpt_dir: ./exp/neuralangelo-dtu-wmask-scan24/decayCurvature@20230607-154540/ckpt
code_dir: ./exp/neuralangelo-dtu-wmask-scan24/decayCurvature@20230607-154540/code
config_dir: ./exp/neuralangelo-dtu-wmask-scan24/decayCurvature@20230607-154540/config

Besides, I found that Neuralangelo uses an appearance embedding; would you consider adding this? @bennyguo

Besides, I found that Neuralangelo uses an appearance embedding; would you consider adding this? @bennyguo

I think the appearance embedding aims to handle the varied exposure in the Tanks and Temples dataset. We're currently experimenting on DTU, so it's not really needed :)

Honestly, I am doing some experiments on Tanks and Temples :)

@flow-specter Are you interested in contributing a tanks-and-temples dataset? I'll let you know if I've designed an elegant way to incorporate appearance embeddings.

@flow-specter Are you interested in contributing a tanks-and-temples dataset? I'll let you know if I've designed an elegant way to incorporate appearance embeddings.

I am willing to, but I have been busy with work lately. If I have free time to upload, I will let you know~

Hi,

I'm working on the tanks-and-temples dataset and got an initial result here. As the authors didn't give a mesh visualization of the Truck scene, I don't know how it compares to Neuralangelo.

Settings:
dataset:
  name: colmap
  root_dir: /TanksandTemples/colmap_truck
  img_downscale: 4
  up_est_method: ground
  center_est_method: lookat
  n_test_traj_steps: 120
  apply_mask: false
  load_data_on_gpu: false
model:
  name: neus
  radius: 2.0

Other settings follow the DTU config given above.

Furthermore, I didn't add a per-image latent embedding to the color network. The background is rendered following NeuS.

it500000-test.mp4

@youmi-zym The results look fair. You may consider setting a smaller radius as a large part of the background is now modeled by the foreground.

Hi,

According to the paper's description:
[image]

I think the curvature loss weight should be:

lambda_curvature:
- 0
- 0.0
- 0.0005
- 5000

rather than

lambda_curvature:
- 5000
- 0.0005
- 0.0
- 500000

as shown above.

@flow-specter Are you interested in contributing a tanks-and-temples dataset? I'll let you know if I've designed an elegant way to incorporate appearance embeddings.

I drafted a branch with appearance embeddings; you can test it by simply adding the following configuration options:

    use_appearance_embedding: true
    use_average_appearance_embedding: true
    appearance_embedding_dim: 17

PSNR seems to be a bit higher on my custom outdoor dataset, although the gain is not obvious on the Mip-NeRF 360 dataset.
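
For anyone curious, a minimal sketch of what a per-image appearance embedding typically looks like, assuming the common NeRF-W-style design where each training image gets a learnable latent that is concatenated to the color network input, with the average embedding used for novel test views; this is an illustration of the idea, not the code in that branch:

    import torch
    import torch.nn as nn

    class AppearanceEmbedding(nn.Module):
        def __init__(self, num_images, dim=17, use_average_for_test=True):
            super().__init__()
            self.embedding = nn.Embedding(num_images, dim)
            self.use_average_for_test = use_average_for_test

        def forward(self, image_ids):
            # image_ids: (N_rays,) index of the source image of each ray.
            if self.training or not self.use_average_for_test:
                return self.embedding(image_ids)
            # Test views have no ground-truth image, so fall back to the mean
            # of all learned embeddings.
            mean = self.embedding.weight.mean(dim=0, keepdim=True)
            return mean.expand(image_ids.shape[0], -1)

    # The color MLP input then becomes, roughly:
    # color_input = torch.cat([features, dir_encoding, app_emb(image_ids)], dim=-1)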

Could anyone reproduce similar results on tanks&temples, e.g. meeting room and courtroom? I used the numerical gradient, coarse-to-fine optimization of the grid, and appearance embedding; however, I did not observe an explicit improvement in reconstruction. Besides, it is surprising that the baseline NeuS performs so well on tanks&temples, better than Geo-NeuS.

Could anyone reproduce similar results on tanks&temples, e.g. meeting room and courtroom? [...]

It's amazing that NeuS with Instant-NGP is so good - it outperforms the original NeuS by a large margin. And from the PSNR in my testing, it seems the performance gain of Neuralangelo comes more from Instant-NGP than from the tricks proposed in the paper.

Could anyone reproduce similar results on tanks&temples, e.g. meeting room and courtroom? [...]

Hi, may I ask which metrics you used to compare your experiments with Geo-NeuS, since the Geo-NeuS paper only provides Chamfer distance while this repo seems to only provide a PSNR evaluation?

Could anyone reproduce similar results on tanks&temples, e.g. meeting room and courtroom? [...]

Hi, may I ask which metrics you used to compare your experiments with Geo-NeuS, since the Geo-NeuS paper only provides Chamfer distance while this repo seems to only provide a PSNR evaluation?

I am referring to the Chamfer distance values of Geo-NeuS and NeuS on Tanks and Temples reported in the paper.

Could anyone reproduce similar results on tanks&temples, e.g. meeting room and courtroom? [...]

Hi @FangjinhuaWang, could you share your configuration for tanks&temples? I'm also interested in reproducing the result, but the resulting mesh of Barn shows many holes in the ground and roof:
[image]

Could anyone reproduce similar results on tanks&temples, e.g. meeting room and courtroom? [...]

Hi @FangjinhuaWang, could you share your configuration for tanks&temples? I'm also interested in reproducing the result, but the resulting mesh of Barn shows many holes in the ground and roof. [image]

Hello @imbinwang,

I met the same problem as you. Have you solved this 'hole' problem?

Best
Mulin

Could anyone reproduce similar results on tanks&temples, e.g. meeting room and courtroom? [...]

Hi @FangjinhuaWang, could you share your configuration for tanks&temples? [...]

Hello @imbinwang, can you provide me with the TAT configuration file and the version of instant-nsr-pl you used? I am unable to render TAT results using NeuS.
Thanks!