Monocular and stereo depth estimation with regression, trained and evaluated on the raw KITTI dataset. For training, dense depth maps are generated with a Sparse-to-Dense network (PyTorch implementation); evaluation is on sparse depth maps (Eigen split). The following models are available: DispNetS and ResNet autoencoders with 18, 34, 50, 101, or 152 layers.
Loss includes:
- regression loss (reversed Huber, a.k.a. berHu, loss)
- smoothing loss
- occlusion loss
- disparity consistency loss (in stereo mode)
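The repository's exact loss code is not shown here; below is a minimal sketch of the reversed Huber (berHu) regression term following Laina et al. (2016). The threshold choice c = 0.2 · max|error| is the value used in that paper and is an assumption about this implementation; the function name `berhu_loss` is illustrative.

```python
# Sketch of the reversed Huber (berHu) loss: L1 for small errors,
# scaled L2 for large ones, as in Laina et al. (2016).
import torch

def berhu_loss(pred, target):
    """berHu loss: |d| if |d| <= c, else (d^2 + c^2) / (2c)."""
    diff = (pred - target).abs()
    # Batch-dependent threshold; clamp avoids division by zero
    # when pred == target everywhere.
    c = (0.2 * diff.max()).clamp(min=1e-6)
    l2_branch = (diff ** 2 + c ** 2) / (2 * c)
    return torch.where(diff <= c, diff, l2_branch).mean()
```

The L2 branch penalizes large residuals more heavily than plain L1, which Laina et al. found helpful for depth regression where most errors are small but a heavy-tailed minority dominates.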
The network was inspired by the following papers:
- Laina et al. "Deeper Depth Prediction with Fully Convolutional Residual Networks" (2016)
- Godard et al. "Unsupervised Monocular Depth Estimation with Left-Right Consistency" (2016)
- Kuznietsov et al. "Semi-Supervised Deep Learning for Monocular Depth Map Prediction" (2017)
- Radwan et al. "VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry" (2018)
- Godard et al. "Digging Into Self-Supervised Monocular Depth Estimation" (2018)
- Yang et al. "Deep Virtual Stereo Odometry Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry" (2018)
Requirements:
- Python 3
- PyTorch 1.0
Training and testing were performed on Ubuntu 16.04 with CUDA 8.0 and a GTX 1080 Ti GPU.
- Clone the repository:
git clone https://github.com/victoriamazo/depth_regression.git
- Download the raw KITTI dataset
- Download a pretrained DispNetS model (mono) here
- Download a pretrained ResNet34 model (mono) here
To run the demo on a single image (the downloaded model is expected to be in the 'ckpts' directory):
python demo.py ckpts/DispNetS_ckpt_best.pth.tar --arch DispNetS --image images/000011.png
To run a ResNet model, pass --arch ResNet and --num_layers 34 instead.
All configuration parameters are explained in "config/config_params.md".
- Training and testing as parallel threads
python3 main.py config/conv.json
- Testing
python3 main.py config/conv.json -m test
- Training
python3 main.py config/conv.json -m train
The following results are from evaluation on the raw KITTI dataset (Eigen split):
Method | Abs Rel | Sq Rel | RMSE | RMSE (log) | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
---|---|---|---|---|---|---|---|
DispNetS (mono) | 0.1850 | 0.6659 | 2.8280 | 0.2193 | 0.7064 | 0.9566 | 0.9909 |
ResNet34 (mono) | 0.2665 | 1.1139 | 3.6226 | 0.3720 | 0.4883 | 0.8079 | 0.9120 |
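The columns above are the standard Eigen-split depth metrics. As a reference for how they are defined, here is a NumPy sketch; the function name `eigen_metrics` is illustrative and not taken from this repository:

```python
# Standard depth-evaluation metrics (Eigen et al.) computed on
# valid ground-truth pixels only (gt > 0, since KITTI depth is sparse).
import numpy as np

def eigen_metrics(gt, pred):
    """Return (abs_rel, sq_rel, rmse, rmse_log, d1, d2, d3)."""
    mask = gt > 0
    gt, pred = gt[mask], pred[mask]
    # Threshold accuracy: fraction of pixels with max(gt/pred, pred/gt) < 1.25^k
    thresh = np.maximum(gt / pred, pred / gt)
    d1 = (thresh < 1.25).mean()
    d2 = (thresh < 1.25 ** 2).mean()
    d3 = (thresh < 1.25 ** 3).mean()
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, d1, d2, d3
```

Lower is better for the four error columns; higher is better for the three δ accuracy columns.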
Qualitative results:
(Image columns: input | ground truth | DispNetS prediction)
(Image columns: input | ground truth | ResNet34 prediction)