BlendedMVS

About

BlendedMVS is a large-scale MVS dataset for generalized multi-view stereo networks. The dataset contains 17k MVS training samples covering 113 scenes, including architecture, sculptures and small objects.

Upgrade to BlendedMVG

BlendedMVG, a superset of BlendedMVS, is a multi-purpose large-scale dataset for solving multi-view geometry-related problems. In addition to the 113 scenes in the BlendedMVS dataset, we follow its blending procedure to generate 389 more scenes (originally shown in GL3D) for BlendedMVG. The number of training images is increased from 17k to over 110k.

BlendedMVG and its preceding works (BlendedMVS and GL3D) have been applied to several key 3D computer vision tasks, including image retrieval, image feature detection and description, two-view outlier rejection and multi-view stereo. If you find BlendedMVS or BlendedMVG useful for your research, please cite:

@article{yao2020blendedmvs,
  title={BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks},
  author={Yao, Yao and Luo, Zixin and Li, Shiwei and Zhang, Jingyang and Ren, Yufan and Zhou, Lei and Fang, Tian and Quan, Long},
  journal={Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

License

BlendedMVS and BlendedMVG are licensed under a Creative Commons Attribution 4.0 International License.

Download

For MVS networks, BlendedMVG is preprocessed and split into 3 smaller subsets (BlendedMVS, BlendedMVS+ and BlendedMVS++):

Dataset	Resolution (768 x 576)	Resolution (2048 x 1536)	Supplementaries
BlendedMVS	low-res set (27.5 GB)	high-res set (156 GB)	textured meshes (9.42 GB), other images (7.56 GB)
BlendedMVS+	low-res set (81.5 GB)	-	-
BlendedMVS++	low-res set (80.0 GB)	-	-

Experiments in the BlendedMVS paper were conducted using the BlendedMVS low-res dataset. In most cases, the low-res dataset is sufficient.

Dataset Structure

The BlendedMVS(G) dataset adopts the MVSNet input format. After downloading the whole dataset, please structure it as listed below:

DATA_ROOT
├── BlendedMVG_list.txt
├── BlendedMVS_list.txt
├── BlendedMVS+_list.txt
├── BlendedMVS++_list.txt
├── ...
├── PID0
│   ├── blended_images
│   │   ├── 00000000.jpg
│   │   ├── 00000000_masked.jpg
│   │   ├── 00000001.jpg
│   │   ├── 00000001_masked.jpg
│   │   └── ...
│   ├── cams
│   │   ├── pair.txt
│   │   ├── 00000000_cam.txt
│   │   ├── 00000001_cam.txt
│   │   └── ...
│   └── rendered_depth_maps
│       ├── 00000000.pfm
│       ├── 00000001.pfm
│       └── ...
├── PID1
├── ...
└── PID501

PID here is the unique project ID listed in the BlendedMVG_list.txt file. We provide blended images with and without masks. For detailed file formats, please refer to MVSNet.
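
The layout above maps directly to file paths. The following is a minimal Python sketch, not part of the official tooling. It assumes that each `*_list.txt` file holds one project ID per line (the list format is not specified here) and that file names are zero-padded to eight digits as shown in the tree. The helper names `read_project_ids` and `sample_paths` are hypothetical.

```python
from pathlib import Path


def read_project_ids(list_file):
    """Return project IDs from a list file.

    Assumption: one PID per non-empty line.
    """
    with open(list_file) as f:
        return [line.strip() for line in f if line.strip()]


def sample_paths(data_root, pid, view_id, masked=False):
    """Build the image, camera, and depth paths for one view of one project."""
    project = Path(data_root) / pid
    name = f"{view_id:08d}"  # e.g. 1 -> "00000001"
    suffix = "_masked" if masked else ""
    return {
        "image": project / "blended_images" / f"{name}{suffix}.jpg",
        "cam": project / "cams" / f"{name}_cam.txt",
        "depth": project / "rendered_depth_maps" / f"{name}.pfm",
    }
```

For example, `sample_paths("DATA_ROOT", pid, 0, masked=True)["image"]` points at `DATA_ROOT/<pid>/blended_images/00000000_masked.jpg`.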

What can you do with BlendedMVS(G)?

Please refer to the following repositories for how to apply BlendedMVS(G) to multi-view stereo and feature detector/descriptor networks:

Tasks	Repositories
Multi-view stereo	MVSNet & R-MVSNet
Descriptors & Detectors	GL3D & ASLFeat & ContextDesc & GeoDesc

Beyond the above tasks, we believe BlendedMVS(G) could also be applied to a variety of geometry-related problems, including, but not limited to:

Sparse outlier rejection (OANet, tested with the original GL3D)
Image retrieval (MIRorR, tested with the original GL3D)
Single-view depth/normal estimation
Two-view disparity estimation
Single/multi-view camera pose regression

Feel free to modify the dataset and adjust to your own tasks!

Note

Online augmentation should be implemented by users themselves. An example for TensorFlow users can be found in MVSNet. An example for PyTorch users can be found in CasMVSNet_pl.
The number of selected source images for a given reference image might be smaller than 10 (when parsing pair.txt).
The depth_min and depth_max in ground truth cameras might be less than or equal to zero (very few cases, when parsing *_cam.txt).
The rendered depth maps and blended images might be empty, as the textured mesh model is not necessarily complete (when dealing with *.pfm and *.jpg files).
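
A data loader can guard against the cases above before using a sample. The sketch below is one possible way to do this, not the authors' implementation. It assumes the file layouts documented by MVSNet: `pair.txt` starts with the view count, followed by a reference ID line and a line `N id0 score0 id1 score1 ...` for each view; `*_cam.txt` holds `extrinsic` (16 values), `intrinsic` (9 values), then a depth line whose first value is depth_min and whose fourth value, if present, is depth_max. It also interprets an "empty" depth map as one with no positive values. All function names are hypothetical.

```python
import numpy as np


def parse_pairs(text):
    """Parse pair.txt text into {ref_id: [src_id, ...]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    pairs = {}
    for i in range(int(lines[0])):
        ref_id = int(lines[1 + 2 * i])
        fields = lines[2 + 2 * i].split()
        n = int(fields[0])  # number of sources, may be smaller than 10
        pairs[ref_id] = [int(x) for x in fields[1:1 + 2 * n:2]]
    return pairs


def parse_depth_range(cam_text):
    """Return (depth_min, depth_max) from *_cam.txt text; depth_max may be None."""
    tokens = cam_text.split()
    # Skip "extrinsic" + 16 values and "intrinsic" + 9 values (27 tokens).
    depth = [float(t) for t in tokens[27:]]
    depth_max = depth[3] if len(depth) >= 4 else None
    return depth[0], depth_max


def usable_view(pairs, ref_id, depth_min, depth_max, num_src):
    """Require enough source views and a strictly positive depth range."""
    if len(pairs.get(ref_id, [])) < num_src:
        return False
    if depth_min <= 0:
        return False
    return depth_max is None or depth_max > depth_min


def read_pfm(path):
    """Read a single-channel PFM file into a (height, width) float32 array."""
    with open(path, "rb") as f:
        if f.readline().decode("ascii").strip() != "Pf":
            raise ValueError("expected a single-channel PFM file")
        width, height = (int(v) for v in f.readline().decode("ascii").split())
        scale = float(f.readline().decode("ascii").strip())
        dtype = "<f4" if scale < 0 else ">f4"  # negative scale: little-endian
        data = np.frombuffer(f.read(), dtype=dtype, count=width * height)
    # PFM stores rows bottom to top, so flip vertically.
    return np.flipud(data.reshape(height, width))


def has_valid_depth(depth):
    """Treat a depth map with no positive values as empty (an assumption)."""
    return bool(np.any(depth > 0))
```

A loader would typically skip any reference view for which `usable_view` or `has_valid_depth` returns `False`, rather than padding missing sources or clamping the depth range.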
