This is the code for the algorithm described in "MegaDepth: Learning Single-View Depth Prediction from Internet Photos", Z. Li and N. Snavely, CVPR 2018. The code skeleton is based on https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix. If you use our code or models for academic purposes, please consider citing:
@inproceedings{MDLi18,
title={MegaDepth: Learning Single-View Depth Prediction from Internet Photos},
author={Zhengqi Li and Noah Snavely},
booktitle={Computer Vision and Pattern Recognition (CVPR)},
year={2018}
}
- The code was written in PyTorch 0.2 and Python 2.7, but it should be easy to adapt to Python 3 and the latest PyTorch version if needed.
- You might need the skimage and h5py Python libraries installed before running the code.
- Download the pretrained model from http://www.cs.cornell.edu/projects/megadepth/dataset/models/best_generalization_net_G.pth and put it in "checkpoints/test_local/best_generalization_net_G.pth".
- In the Python file "models/HG_model.py", in the init function, change the model-loading line to "model_parameters = self.load_network(model, 'G', 'best_generalization')".
- Run the demo code:
python demo.py
You should see an inverse depth prediction saved as demo.png, computed from the original photo demo.jpg. If you want to use RGB maps for visualization, like the figures in our paper, you have to install and run a semantic segmentation model from https://github.com/kazuto1011/pspnet-pytorch trained on ADE20K to mask out the sky, because inconsistent depth predictions in unmasked sky regions make the RGB visualization unreasonable.
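The sky-masking step described above can be sketched with NumPy as follows. This is a minimal illustration, not the code used for the figures in the paper: the function name `inverse_depth_to_gray`, the boolean `sky_mask` input (assumed to come from a separate segmentation model), and the choice to paint masked pixels black are all assumptions. It produces a grayscale image; applying a color map on top is left out.

```python
import numpy as np


def inverse_depth_to_gray(inv_depth, sky_mask=None):
    """Normalize an inverse depth map to uint8, ignoring sky pixels.

    inv_depth: 2D array of predicted inverse depth.
    sky_mask:  optional 2D boolean array, True where a pixel is sky
               (hypothetical input, e.g. from a segmentation network).
    """
    inv = np.asarray(inv_depth, dtype=np.float64)
    if sky_mask is None:
        keep = np.ones(inv.shape, dtype=bool)
    else:
        keep = ~np.asarray(sky_mask, dtype=bool)
    if not keep.any():
        raise ValueError("every pixel is masked out")

    # Compute the value range from non-sky pixels only, so that
    # unreliable sky predictions do not stretch the normalization.
    lo = inv[keep].min()
    hi = inv[keep].max()
    scale = hi - lo if hi > lo else 1.0

    out = np.zeros(inv.shape, dtype=np.float64)  # sky stays at 0 (black)
    out[keep] = (inv[keep] - lo) / scale
    return np.round(out * 255.0).astype(np.uint8)
```

Restricting the min/max to non-sky pixels is the point of the mask: a single extreme sky value would otherwise compress the contrast of the rest of the scene.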
- Download the MegaDepth V1 dataset from the project website: http://www.cs.cornell.edu/projects/megadepth/
- Download the pretrained model (specific to the MD dataset) from http://www.cs.cornell.edu/projects/megadepth/dataset/models/best_vanila_net_G.pth and put it in "checkpoints/test_local/best_vanila_net_G.pth".
- Updated: You might also consider downloading 4 extra pretrained models (see the README and our website for explanations) from http://www.cs.cornell.edu/projects/megadepth/dataset/models/test_model_1_4.zip
- Download the test list files from http://www.cs.cornell.edu/projects/megadepth/dataset/data_lists/test_lists.tar.gz. The archive should include two folders, corresponding to images with landscape and portrait orientations.
- Download the precomputed sparse features from http://www.cs.cornell.edu/projects/megadepth/dataset/Megadepth_v1/sparse_features.zip
- To compute the scale-invariant RMSE on the MD test set, change the variable "dataset_root" in the Python file "rmse_error_main.py" to the root directory of the MegaDepth_v1 folder, change the variables "test_list_dir_l" and "test_list_dir_p" to the corresponding folder paths of the test lists, and run:
python rmse_error_main.py
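For readers unfamiliar with the metric, here is a minimal sketch of a scale-invariant RMSE. It assumes the common log-domain definition (the standard deviation of the per-pixel log-depth difference), which may differ in detail from what "rmse_error_main.py" computes. The function name and the optional `mask` argument are illustrative, not part of the repository.

```python
import numpy as np


def scale_invariant_rmse(pred_depth, gt_depth, mask=None):
    """Scale-invariant RMSE in the log domain (assumed definition).

    With d_i = log(pred_i) - log(gt_i) over the n valid pixels:
        si_rmse = sqrt( mean(d_i^2) - mean(d_i)^2 )
    """
    pred = np.asarray(pred_depth, dtype=np.float64)
    gt = np.asarray(gt_depth, dtype=np.float64)

    # Only pixels with positive depth in both maps have a defined log.
    valid = (pred > 0) & (gt > 0)
    if mask is not None:
        valid &= np.asarray(mask, dtype=bool)
    if not valid.any():
        raise ValueError("no valid pixels to evaluate")

    diff = np.log(pred[valid]) - np.log(gt[valid])
    variance = np.mean(diff ** 2) - np.mean(diff) ** 2
    # Guard against tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(variance, 0.0)))
```

Multiplying the prediction by any positive constant shifts every log difference by the same amount, so the error is unchanged. This is why the metric suits depth that is only known up to scale.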
- To compute the Structure from Motion Disagreement Rate (SDR), change the variable "dataset_root" in the Python file "SDR_compute.py" to the root directory of the MegaDepth_v1 folder, change the variables "test_list_dir_l" and "test_list_dir_p" to the corresponding folder paths of the test lists, and run:
python SDR_compute.py
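As a rough illustration of what a disagreement rate measures, the sketch below compares the ordinal relation (closer, farther, roughly equal) of point pairs in a predicted depth map against sparse SfM depths. It is not the repository's implementation: the function names, the flat-index `pairs` format, and the tolerance `delta=0.1` are assumptions made for this example.

```python
import numpy as np


def ordinal_relation(d_i, d_j, delta=0.1):
    """Return +1 where d_i/d_j > 1 + delta, -1 where < 1 - delta, else 0."""
    ratio = np.asarray(d_i, dtype=np.float64) / np.asarray(d_j, dtype=np.float64)
    rel = np.zeros(ratio.shape, dtype=np.int8)
    rel[ratio > 1.0 + delta] = 1
    rel[ratio < 1.0 - delta] = -1
    return rel


def sfm_disagreement_rate(pred_depth, sfm_depth, pairs, delta=0.1):
    """Fraction of point pairs whose ordinal relation differs.

    pred_depth, sfm_depth: arrays of positive depths, same shape.
    pairs: (n, 2) integer array of flat indices (hypothetical format).
    """
    pred = np.asarray(pred_depth, dtype=np.float64).ravel()
    sfm = np.asarray(sfm_depth, dtype=np.float64).ravel()
    pairs = np.asarray(pairs, dtype=np.int64).reshape(-1, 2)
    if pairs.shape[0] == 0:
        raise ValueError("no point pairs given")

    i, j = pairs[:, 0], pairs[:, 1]
    pred_rel = ordinal_relation(pred[i], pred[j], delta)
    sfm_rel = ordinal_relation(sfm[i], sfm[j], delta)
    return float(np.mean(pred_rel != sfm_rel))
```

Because only depth ratios are compared, the rate is unaffected by a global scale factor on either depth map.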
- If you want to run our model on arbitrary Internet photos, please download the pretrained model from http://www.cs.cornell.edu/projects/megadepth/dataset/models/best_generalization_net_G.pth, which generalizes much better to completely unknown scenes. (Note: for clarification, this model is intended for more general-purpose use. We trained the network on top of DIW pretrained weights, so it may perform better than what was described in the paper. If you want to compare your method with ours, you might use the models from http://www.cs.cornell.edu/projects/megadepth/.)
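The download-and-place steps for the ".pth" checkpoints can be scripted. The sketch below is a convenience helper written for Python 3, not part of the repository. It assumes the "checkpoints/test_local/" layout given above and keeps the file name from the URL; the function names are made up for this example.

```python
import os
import urllib.request

# Directory the instructions above use for pretrained weights.
CHECKPOINT_DIR = os.path.join("checkpoints", "test_local")

GENERALIZATION_MODEL_URL = (
    "http://www.cs.cornell.edu/projects/megadepth/dataset/models/"
    "best_generalization_net_G.pth"
)


def checkpoint_path(url, checkpoint_dir=CHECKPOINT_DIR):
    """Local path for a checkpoint URL: <checkpoint_dir>/<file name in URL>."""
    filename = url.rstrip("/").rsplit("/", 1)[-1]
    return os.path.join(checkpoint_dir, filename)


def fetch_checkpoint(url, checkpoint_dir=CHECKPOINT_DIR):
    """Download the checkpoint unless the file already exists locally."""
    dest = checkpoint_path(url, checkpoint_dir)
    if not os.path.exists(dest):
        os.makedirs(checkpoint_dir, exist_ok=True)
        urllib.request.urlretrieve(url, dest)
    return dest
```

Calling `fetch_checkpoint(GENERALIZATION_MODEL_URL)` from the repository root would place the weights where the demo instructions expect them.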