
Python3 / PyTorch implementation of the following paper: Fine-grained Semantics-aware Representation Enhancement for Self-supervised Monocular Depth Estimation. ICCV 2021 (oral)



This is a Python3 / PyTorch implementation of FSRE-Depth, as described in the following paper:

Fine-grained Semantics-aware Representation Enhancement for Self-supervised Monocular Depth Estimation

Hyunyoung Jung, Eunhyeok Park and Sungjoo Yoo

ICCV 2021 (oral)

arXiv pdf

The code was implemented based on Monodepth2.


This code was developed with torch==1.3.0 and torchvision==0.4.1, using two NVIDIA TITAN Xp GPUs with distributed training. Other versions may produce different results.

pip install -r requirements.txt
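
Since results may depend on the library versions, it can help to compare the installed packages against the versions named above before training. The following is a minimal sketch, not part of this repository: the helper name `find_mismatches` and the `EXPECTED` table are illustrative assumptions, and only the two version strings come from this README.

```python
from importlib import metadata

# Versions stated in this README; other versions may behave differently.
EXPECTED = {"torch": "1.3.0", "torchvision": "0.4.1"}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `lookup` maps a package name to its installed version string and
    raises PackageNotFoundError when the package is absent.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop local build tags such as "+cu100" before comparing.
        base = installed.split("+")[0] if installed else None
        if base != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

An empty result means both packages match the stated versions; anything else is only a warning that numbers may differ from those reported here.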


KITTI Raw Data and pre-computed segmentation images are required for training.

    ├── 2011_09_26/             
    ├── 2011_09_28/                    
    ├── 2011_09_29/
    ├── 2011_09_30/
    ├── 2011_10_03/
    └── segmentation/   # download and unzip "segmentation.zip" 
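
The layout above can be checked before launching a long training run. This is a small sketch under the assumption that only the six top-level folders listed above need to exist; the function name `missing_dirs` is hypothetical and the script does not inspect the contents of each folder.

```python
from pathlib import Path

# Top-level folders from the layout shown above.
REQUIRED_DIRS = [
    "2011_09_26",
    "2011_09_28",
    "2011_09_29",
    "2011_09_30",
    "2011_10_03",
    "segmentation",
]


def missing_dirs(data_path):
    """Return the required folder names that are absent under data_path."""
    root = Path(data_path)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py YOUR_KITTI_DATA_PATH
    absent = missing_dirs(sys.argv[1]) if len(sys.argv) > 1 else REQUIRED_DIRS
    print("missing:", ", ".join(absent) if absent else "none")
```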


To train the full model, run the following command:

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node 2 --master_port YOUR_PORT_NUMBER train_ddp.py --data_path YOUR_KITTI_DATA_PATH
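
If you have no preferred value for YOUR_PORT_NUMBER, one option is to ask the operating system for an unused port. The sketch below is an illustration only: `find_free_port` and `build_launch_command` are hypothetical helpers, and the flags are copied from the command above. The port could in principle be taken by another process between the check and the launch.

```python
import socket


def find_free_port():
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def build_launch_command(data_path, port, nproc=2):
    """Assemble the training command shown above as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(nproc),
        "--master_port", str(port),
        "train_ddp.py",
        "--data_path", data_path,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_launch_command("YOUR_KITTI_DATA_PATH", find_free_port())))
```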


The ground truth depth maps should be prepared prior to evaluation.

python export_gt_depth.py --data_path YOUR_KITTI_DATA_PATH --split eigen

MODEL_DIR should have the following layout:

    ├── encoder.pth  # required      
    ├── decoder.pth  # required             
    ├── ...

Then run the evaluation command:

python evaluate_depth.py --load_weights_folder MODEL_DIR --data_path YOUR_KITTI_DATA_PATH

Download Models

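| Backbone  | Input     | Download     | AbsRel | SqRel | RMSE  | RMSE log | delta < 1.25 | delta < 1.25^2 | delta < 1.25^3 |
|-----------|-----------|--------------|--------|-------|-------|----------|--------------|----------------|----------------|
| ResNet-18 | 192 x 640 | Drive (.zip) | 0.105  | 0.708 | 4.546 | 0.182    | 0.886        | 0.964          | 0.983          |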
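
For readers unfamiliar with the metric columns, the sketch below computes them from paired ground truth and predicted depths, assuming the definitions commonly used for KITTI depth evaluation. It is written in plain Python for clarity and is not the code in `evaluate_depth.py`; it leaves out any preprocessing (cropping, depth capping, scaling) that an evaluation script may apply. The function name `compute_errors` and the dictionary keys are chosen here for illustration.

```python
import math


def compute_errors(gt, pred):
    """Compute depth metrics for equal-length sequences of positive depths."""
    n = len(gt)
    pairs = list(zip(gt, pred))
    # Per-pixel ratio used by the delta accuracy thresholds.
    ratios = [max(g / p, p / g) for g, p in pairs]
    return {
        # Lower is better for the four error metrics.
        "abs_rel": sum(abs(g - p) / g for g, p in pairs) / n,
        "sq_rel": sum((g - p) ** 2 / g for g, p in pairs) / n,
        "rmse": math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n),
        "rmse_log": math.sqrt(
            sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n
        ),
        # Higher is better: fraction of pixels with ratio below the threshold.
        "delta_1": sum(r < 1.25 for r in ratios) / n,
        "delta_2": sum(r < 1.25 ** 2 for r in ratios) / n,
        "delta_3": sum(r < 1.25 ** 3 for r in ratios) / n,
    }


if __name__ == "__main__":
    # Toy values in metres, purely illustrative.
    for name, value in compute_errors([10.0, 20.0], [9.0, 22.0]).items():
        print(f"{name}: {value:.3f}")
```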


Please use the following citation when referencing our work:

    author    = {Jung, Hyunyoung and Park, Eunhyeok and Yoo, Sungjoo},
    title     = {Fine-Grained Semantics-Aware Representation Enhancement for Self-Supervised Monocular Depth Estimation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {12642-12652}