Indoor SfMLearner
PyTorch implementation of our ECCV2020 paper:
P2Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation
Zehao Yu*, Lei Jin*, Shenghua Gao
(* Equal Contribution)
Getting Started
Installation
pip install -r requirements.txt
Then install pytorch with
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
PyTorch versions >= 0.4.1 should work well.
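To verify the install, a quick check like the following (a minimal sketch, not part of the repo) confirms the PyTorch version and whether CUDA is visible:
import torch

# quick sanity check: print the installed version and whether a GPU is available
print(torch.__version__)
print(torch.cuda.is_available())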
Download pretrained model
Please download the pretrained models from OneDrive and extract them:
tar -xzvf ckpts.tar.gz
rm ckpts.tar.gz
Prediction on single image
Run the following command to predict on a single image:
python inference_single_image.py --image_path=/path/to/image
By default, the script saves the predicted depth to the same folder as the input image.
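To run the script over a whole folder of images, a small wrapper such as the one below can be used. The folder path is a placeholder and only the --image_path flag shown above is assumed:
import glob
import subprocess

# hypothetical batch helper: call the single-image script on every PNG in a folder
for path in sorted(glob.glob("/path/to/images/*.png")):  # placeholder folder
    subprocess.run(["python", "inference_single_image.py", "--image_path", path], check=True)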
Evaluation
Download the testing data from OneDrive and put it in ./data.
cd data
tar -xzvf nyu_test.tar.gz
tar -xzvf scannet_test.tar.gz
tar -xzvf scannet_pose.tar.gz
cd ../
NYUv2 Depth
CUDA_VISIBLE_DEVICES=1 python evaluation/nyuv2_eval_depth.py \
--data_path ./data \
--load_weights_folder ckpts/weights_5f \
--post_process
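The script reports the usual monocular-depth metrics. The sketch below shows how these metrics are commonly computed (including the median scaling typically applied to unsupervised predictions); it is only an illustration, not the exact code in evaluation/nyuv2_eval_depth.py:
import numpy as np

def depth_metrics(gt, pred):
    # standard metrics: abs_rel, rmse, rmse_log and delta < 1.25^k accuracies
    pred = pred * np.median(gt) / np.median(pred)  # median scaling, common for unsupervised models
    thresh = np.maximum(gt / pred, pred / gt)
    a1, a2, a3 = [(thresh < 1.25 ** k).mean() for k in (1, 2, 3)]
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, rmse, rmse_log, a1, a2, a3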
NYUv2 normal
CUDA_VISIBLE_DEVICES=1 python evaluation/nyuv2_eval_norm.py \
--data_path ./data \
--load_weights_folder ckpts/weights_5f
# optionally add --post_process
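For reference, surface normals can be derived from a predicted depth map and the camera intrinsics by back-projecting pixels and crossing finite differences. This is a hypothetical sketch; the released evaluation code may compute normals differently:
import numpy as np

def normals_from_depth(depth, K):
    # back-project each pixel to a 3D point using intrinsics K, then take the
    # cross product of horizontal and vertical differences as the normal
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1)          # (H, W, 3) camera-space points
    dx = pts[:, 1:, :] - pts[:, :-1, :]             # horizontal differences
    dy = pts[1:, :, :] - pts[:-1, :, :]             # vertical differences
    n = np.cross(dx[:-1, :, :], dy[:, :-1, :])      # (H-1, W-1, 3)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8
    return n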
ScanNet Depth
CUDA_VISIBLE_DEVICES=1 python evaluation/scannet_eval_depth.py \
--data_path ./data/scannet_test \
--load_weights_folder ckpts/weights_5f \
--post_process
ScanNet Pose
CUDA_VISIBLE_DEVICES=1 python evaluation/scannet_eval_pose.py \
--data_path ./data/scannet_pose \
--load_weights_folder ckpts/weights_5f \
--frame_ids 0 1
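As a generic illustration of how two relative camera poses (4x4 matrices) can be compared, the sketch below computes a rotation error in degrees and a translation error; the released ScanNet pose evaluation may use different conventions or metrics:
import numpy as np

def relative_pose_errors(T_pred, T_gt):
    # rotation error from the trace of the relative rotation, translation error as L2 distance
    dR = T_pred[:3, :3].T @ T_gt[:3, :3]
    cos_angle = np.clip((np.trace(dR) - 1.0) / 2.0, -1.0, 1.0)
    rot_err_deg = np.degrees(np.arccos(cos_angle))
    trans_err = np.linalg.norm(T_pred[:3, 3] - T_gt[:3, 3])
    return rot_err_deg, trans_err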
Training
First, download the NYU Depth V2 raw data from the official website and unzip it to DATA_PATH.
Extract Superpixel
Run the following command to extract superpixels:
python extract_superpixel.py --data_path DATA_PATH --output_dir ./data/segments
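As an illustration only (the algorithm and parameters used by extract_superpixel.py may differ), per-pixel superpixel labels of the kind consumed during training can be produced with a graph-based segmentation such as Felzenszwalb's:
import numpy as np
from PIL import Image
from skimage.segmentation import felzenszwalb

# hypothetical sketch: segment one RGB image into superpixels and save the label map
img = np.array(Image.open("example_rgb.png"))        # placeholder path
segments = felzenszwalb(img, scale=100, sigma=0.5, min_size=50)
np.save("example_segments.npy", segments)            # one integer label per pixel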
3-frames
Run the following command to train our network:
CUDA_VISIBLE_DEVICES=1 python train_geo.py \
--model_name 3frames \
--data_path DATA_PATH \
--val_path ./data \
--segment_path ./data/segments \
--log_dir ./logs \
--lambda_planar_reg 0.05 \
--batch_size 12 \
--scales 0 \
--frame_ids_to_train 0 -1 1
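The --lambda_planar_reg flag weights the plane-regularization term. As a rough, hypothetical sketch of the idea only (not the repo's implementation), one can fit a plane to the back-projected points of each superpixel by least squares and penalize the residuals:
import torch

def planar_residual(points):
    # points: (N, 3) back-projected 3D points of one superpixel
    # fit the plane Z = aX + bY + c by least squares and return the mean |residual|
    A = torch.cat([points[:, :2], torch.ones_like(points[:, :1])], dim=1)  # (N, 3)
    b = points[:, 2:3]                                                     # (N, 1)
    sol = torch.linalg.lstsq(A, b).solution                                # plane parameters (3, 1)
    return (A @ sol - b).abs().mean()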
5-frames
Initializing with the pretrained model from the 3-frame setting gives better results.
CUDA_VISIBLE_DEVICES=1 python train_geo.py \
--model_name 5frames \
--data_path DATA_PATH \
--val_path ./data \
--segment_path ./data/segments \
--log_dir ./logs \
--lambda_planar_reg 0.05 \
--batch_size 12 \
--scales 0 \
--load_weights_folder FOLDER_OF_3FRAMES_MODEL \
--frame_ids_to_train 0 -2 -1 1 2
Acknowledgements
This project is built upon Monodepth2. We thank the authors of Monodepth2 for their great work and repository.
License
TBD
Citation
Please cite our paper if you use this code or our results.
@inproceedings{IndoorSfMLearner,
  author    = {Zehao Yu and Lei Jin and Shenghua Gao},
  title     = {P$^{2}$Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation},
  booktitle = {ECCV},
  year      = {2020}
}