Jiang Wu1 · Rui Li1,2 · Haofei Xu2,3 · Wenxun Zhao1 · Yu Zhu1
Jinqiu Sun1 · Yanning Zhang1
1Northwestern Polytechnical University · 2ETH Zürich · 3University of Tübingen, Tübingen AI Center
CVPR 2024
In this paper, we propose GoMVS to aggregate geometrically consistent costs, yielding better utilization of adjacent geometries. Specifically, we correspond and propagate adjacent costs to the reference pixel by leveraging local geometric smoothness in conjunction with surface normals. We achieve this with the geometrically consistent propagation (GCP) module: it computes the correspondence from the adjacent depth hypothesis space to the reference depth space using surface normals, uses this correspondence to propagate adjacent costs to the reference geometry, and then applies a convolution for aggregation. Our method achieves new state-of-the-art performance on the DTU, Tanks and Temples, and ETH3D datasets. Notably, it ranks 1st on the Tanks and Temples Advanced benchmark.
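As a rough illustration of the normal-guided propagation idea behind GCP (this is not the repository's implementation; the function name, single-pixel formulation, and intrinsics below are all illustrative), a depth hypothesis at an adjacent pixel can be mapped to the reference pixel under a local planar assumption:

```python
import numpy as np

def propagate_depth(d_q, q, p, normal, K_inv):
    """Map a depth hypothesis d_q at adjacent pixel q to reference pixel p,
    assuming both pixels see the same local plane with the given surface normal."""
    q_h = np.array([q[0], q[1], 1.0])  # homogeneous pixel coordinates
    p_h = np.array([p[0], p[1], 1.0])
    # The plane passes through the 3D point d_q * K^-1 * q_h; intersecting the
    # viewing ray through p with that plane scales the depth by a ratio of dot products.
    return d_q * (normal @ (K_inv @ q_h)) / (normal @ (K_inv @ p_h))

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])  # illustrative camera intrinsics
K_inv = np.linalg.inv(K)

# Fronto-parallel surface (normal along -z): depth is unchanged by propagation.
print(propagate_depth(2.0, (321, 240), (320, 240), np.array([0.0, 0.0, -1.0]), K_inv))  # -> 2.0
```

For a slanted surface the ratio differs from 1, so the neighbor's hypothesis is warped into the reference pixel's depth space instead of being copied, which is the key difference from naive cost aggregation over a fixed depth plane.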
conda create -n gomvs python=3.8 # use python 3.8
conda activate gomvs
pip install -r requirements.txt
# Our code is tested on NVIDIA A100 GPUs (python=3.8, pytorch=1.12, cuda=11.6)
We first train the model on the DTU dataset and then fine-tune it on the BlendedMVS dataset.
Download the DTU training data and Depths_raw (preprocessed by MVSNet), and unzip them as follows:
dtu_training
├── Cameras
├── Depths
├── Depths_raw
└── Rectified
For the DTU testing set, download the preprocessed DTU testing data and unzip it as the test data folder, which should be:
dtu_testing
├── Cameras
├── scan1
├── scan2
└── ...
Download the BlendedMVS low-res set and unzip it as below:
BlendedMVS
├── 5a0271884e62597cdee0d0eb
│   ├── blended_images
│   ├── cams
│   └── rendered_depth_maps
├── 59338e76772c3e6384afbb15
├── 59f363a8b45be22330016cad
├── ...
├── all_list.txt
├── training_list.txt
└── validation_list.txt
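The split files list one scene hash per line. A minimal sketch that validates the layout above before training (the subfolder names come from the tree; the function and the throwaway demo layout are illustrative):

```python
import os
import tempfile

REQUIRED = ("blended_images", "cams", "rendered_depth_maps")

def check_blendedmvs(root, list_file="training_list.txt"):
    """Return (scene, subfolder) pairs from a split list that are missing on disk."""
    with open(os.path.join(root, list_file)) as f:
        scenes = [line.strip() for line in f if line.strip()]
    missing = []
    for scene in scenes:
        for sub in REQUIRED:
            if not os.path.isdir(os.path.join(root, scene, sub)):
                missing.append((scene, sub))
    return missing

# Demo on a throwaway layout mirroring the tree above.
root = tempfile.mkdtemp()
scene = "5a0271884e62597cdee0d0eb"
for sub in REQUIRED:
    os.makedirs(os.path.join(root, scene, sub))
with open(os.path.join(root, "training_list.txt"), "w") as f:
    f.write(scene + "\n")
print(check_blendedmvs(root))  # -> []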
You can download the Tanks and Temples dataset here and unzip it. For the intermediate set, unzip "short_range_caemeras_for_mvsnet.zip" and replace the camera parameter files inside the "cams" folder with them.
tanksandtemples
├── advanced
│   ├── Auditorium
│   └── ...
└── intermediate
    ├── Family
    └── ...
We provide normal maps generated by the Omnidata monocular normal estimation network; the high-resolution normals are produced with the scripts provided by MonoSDF. You can download our preprocessed normal maps.
Download the DTU training and DTU testing normal maps and unzip them as follows:
dtu_training_normal
├── scan1_train
│   ├── 000000_normal.npy
│   ├── 000000_normal.png
│   └── ...
├── scan2_train
└── ...
dtu_test_normal
├── scan1
│   ├── 000000_normal.npy
│   ├── 000000_normal.png
│   └── ...
├── scan4
├── scan9
└── ...
We also provide the normal maps for the BlendedMVS low-res set. You can download them here.
BlendedMVS_normal
├── 5a0271884e62597cdee0d0eb
│   ├── 000000_normal.npy
│   ├── 000000_normal.png
│   └── ...
├── 59338e76772c3e6384afbb15
├── 59f363a8b45be22330016cad
└── ...
The structure of the TNT normal data is as follows.
TnT_normal
├── Auditorium
├── ...
├── Family
│   ├── 000000_normal.npy
│   ├── 000000_normal.png
│   └── ...
└── ...
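The per-view `.npy` files can be loaded directly with numpy. A minimal sketch with a stand-in file (the H x W x 3 unit-vector layout is our assumption; verify it against your download):

```python
import os
import tempfile
import numpy as np

# Create a stand-in normal map in the dataset's naming scheme.
tmp = tempfile.mkdtemp()
dummy = np.zeros((4, 5, 3), dtype=np.float32)
dummy[..., 2] = -1.0  # fronto-parallel normals pointing toward the camera
path = os.path.join(tmp, "000000_normal.npy")
np.save(path, dummy)

normal = np.load(path)
# Re-normalize to unit length before use, guarding against resize/interpolation artifacts.
normal = normal / (np.linalg.norm(normal, axis=-1, keepdims=True) + 1e-8)
print(normal.shape)  # -> (4, 5, 3)
```

The matching `_normal.png` files appear to be visualizations; the `.npy` files are the ones to feed into training and testing.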
We provide training and testing scripts for the different datasets in the `scripts` folder. Before running, please configure the paths in the script files.
Specify `MVS_TRAINING` and `NORMAL_PATH` in `scripts/train/*.sh`.
Then run the script to train the model on the DTU dataset:
bash scripts/train/train_dtu.sh
To fine-tune the model on BlendedMVS, set `CKPT` to the path of the model pre-trained on DTU, then run:
bash scripts/train/train_bld_fintune.sh
To start testing on DTU, set the configuration in `scripts/test/test_dtu.sh` and run:
bash scripts/test/test_dtu.sh
For testing on the Tanks and Temples (TNT) dataset, use `test_tnt.sh` to generate depth maps and `dynamic_fusion.sh` to fuse them into the final point clouds:
bash scripts/test/test_tnt.sh
bash scripts/test/dynamic_fusion.sh
Setting `CKPT_FILE` to the fine-tuned model may yield better results.
| Methods | Acc. (mm) | Comp. (mm) | Overall (mm) |
|---|---|---|---|
| GoMVS (pre-trained model) | 0.347 | 0.227 | 0.287 |
| Methods | Mean | Aud. | Bal. | Cou. | Mus. | Pal. | Temp. |
|---|---|---|---|---|---|---|---|
| GoMVS (pre-trained model) | 43.07 | 35.52 | 47.15 | 42.52 | 52.08 | 36.34 | 44.82 |
| Methods | Mean | Fam. | Fra. | Hor. | Lig. | M60 | Pan. | Pla. | Tra. |
|---|---|---|---|---|---|---|---|---|---|
| GoMVS (pre-trained model) | 66.44 | 82.68 | 69.23 | 69.19 | 63.56 | 65.13 | 62.10 | 58.81 | 60.80 |
Please cite our paper if you use the code in this repository:
@inproceedings{wu2024gomvs,
title={GoMVS: Geometrically Consistent Cost Aggregation for Multi-View Stereo},
author={Wu, Jiang and Li, Rui and Xu, Haofei and Zhao, Wenxun and Zhu, Yu and Sun, Jinqiu and Zhang, Yanning},
booktitle={CVPR},
year={2024}
}
Our code borrows from TransMVSNet, GeoMVSNet, and IronDepth. We are grateful for these works' contributions!