Question about reconstruction strategy in iterative novel view synthesis
Closed this issue · 5 comments
Sorry for opening a new issue; I was afraid you might not see my follow-up question on issue #14 (comment).
Could I ask for more details?
If we already have an initial DUSt3R reconstruction from the input sparse views, what is the strategy for running DUSt3R and aligning the results as the iteration progresses?
(1) One way I can think of is to feed all input sparse views and all generated views to DUSt3R and output the globally aligned scene at each step. However, this becomes prohibitively expensive in GPU resources once the number of images exceeds 30.
(2) Another way would be to treat the initial DUSt3R reconstruction as the global reconstruction and to reconstruct the 25 newly generated novel views as a local one. However, aligning the global and local reconstructions requires correspondences. How do you obtain those correspondences?
Thanks!
What is the difference between the nvs_single_view_ref_iterative function and the nvs_single_view_1drc_iterative function?
Hi, sorry for the late response.
We run DUSt3R for all existing and generated views at each step. Instead of using all 25 frames, we sample 3 or 5 views from them for reconstruction.
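For readers following along, here is a minimal sketch of the sampling idea described above. It assumes the 3 or 5 views are picked at evenly spaced positions across the 25 generated frames; the reply does not say how the views are chosen, so the spacing rule and the names sample_view_indices and views_for_reconstruction are hypothetical and not taken from the repository.

```python
def sample_view_indices(num_frames=25, num_samples=5):
    """Pick evenly spaced frame indices (assumed strategy, not confirmed by the authors)."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if num_samples >= num_frames:
        return list(range(num_frames))
    if num_samples == 1:
        return [num_frames // 2]
    # Spread the samples from the first frame to the last frame.
    step = (num_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


def views_for_reconstruction(existing_views, generated_frames, num_samples=5):
    """Combine all existing views with a few sampled frames from the new clip.

    The returned list is what would be handed to the reconstruction step;
    the actual DUSt3R call is intentionally left out of this sketch.
    """
    indices = sample_view_indices(len(generated_frames), num_samples)
    return list(existing_views) + [generated_frames[i] for i in indices]


if __name__ == "__main__":
    frames = [f"frame_{i:02d}" for i in range(25)]
    print(sample_view_indices(25, 3))
    print(views_for_reconstruction(["input_view"], frames, num_samples=5))
```

With this scheme the number of views passed to reconstruction grows by only 3 or 5 per step instead of 25, which is what keeps the per-step cost manageable.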
Also, the nvs_single_view_ref_iterative and nvs_single_view_1drc_iterative functions seem to be identical.
The nvs_single_view_ref_iterative function performs view synthesis that always starts from the reference image. For example, the first step follows a left-turning trajectory that starts from the reference image, and all subsequent steps also start from the reference image. In contrast, the nvs_single_view_1drc_iterative function uses the last frame of the previously generated novel-view sequence as the starting point for the next step.
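The difference between the two modes can be sketched as a single loop with a switch for the starting frame. This is only an illustration of the behavior described above: run_iterative, toy_generate_clip, and the start_from parameter are hypothetical names, and the toy generator stands in for the video model, so none of this reflects the repository's actual signatures.

```python
def toy_generate_clip(start_frame, step, num_frames=3):
    """Stand-in for the video model: returns labeled frames derived from the start frame."""
    return [f"{start_frame}>s{step}f{i}" for i in range(num_frames)]


def run_iterative(reference, num_steps, generate_clip, start_from="reference"):
    """Generate one clip per step, choosing the start frame by mode.

    start_from="reference": every step starts from the reference image.
    start_from="last_frame": each step starts from the last frame of the
    previous clip.
    """
    if start_from not in ("reference", "last_frame"):
        raise ValueError("start_from must be 'reference' or 'last_frame'")
    start = reference
    clips = []
    for step in range(num_steps):
        clip = generate_clip(start, step)
        clips.append(clip)
        if start_from == "last_frame":
            # Chain the trajectory: continue from where the last clip ended.
            start = clip[-1]
    return clips


if __name__ == "__main__":
    for mode in ("reference", "last_frame"):
        print(mode, run_iterative("ref", 2, toy_generate_clip, start_from=mode))
```

In the "reference" mode every clip fans out from the same image, while in the "last_frame" mode the clips form one continuous trajectory.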
From my understanding of the code, the iterative process generates 25 frames 25 times, right? (The video model generates 25 frames in each iteration.)
But each time, we replace the point cloud projection with the newly inpainted view that was generated in the previous iteration.
Did I understand the process correctly?
Hello, guys! My problem has been solved. I will close this issue.