Monocular-Depth-Estimation

Depth estimation from single-view RGB images


Perceptual Dense Network for High-Quality Monocular Depth Estimation

We propose an approach that integrates low-level and high-level feature learning to estimate high-quality depth maps from single-view 2-D images.

We employ a deep fully convolutional architecture and optimization objectives that minimize a set of per-pixel loss functions together with a perceptual loss function, along with data augmentation and training strategies.
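As a rough sketch of how a per-pixel term and a perceptual term can be combined into a single objective (the exact losses, feature extractor, and weights used by this repository are defined in train.py; the VGG16 feature extractor and L1 terms below are assumptions for illustration only):

import torch
import torch.nn as nn
from torchvision import models

class CombinedDepthLoss(nn.Module):
    """Sketch of a per-pixel + perceptual objective; not the repository's exact loss."""
    def __init__(self, perceptual_weight=0.1):
        super().__init__()
        self.perceptual_weight = perceptual_weight
        # Assumed perceptual feature extractor: frozen early VGG16 layers
        vgg = models.vgg16(pretrained=True).features[:16].eval()
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg
        self.l1 = nn.L1Loss()

    def forward(self, pred_depth, gt_depth):
        # Per-pixel term: L1 distance between predicted and ground-truth depth
        pixel_loss = self.l1(pred_depth, gt_depth)
        # Perceptual term: L1 distance in VGG feature space
        # (single-channel depth maps of shape (N, 1, H, W) are repeated to
        # 3 channels; input normalization is omitted for brevity)
        pred_feat = self.vgg(pred_depth.repeat(1, 3, 1, 1))
        gt_feat = self.vgg(gt_depth.repeat(1, 3, 1, 1))
        perceptual_loss = self.l1(pred_feat, gt_feat)
        return pixel_loss + self.perceptual_weight * perceptual_loss

The --perceptual_weight flag described under Training plays the role of perceptual_weight in this sketch.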

Results

Qualitative

(Qualitative comparison figure: input RGB image, ground-truth depth map, our results.)

Quantitative

δ1 < 1.25    δ2 < 1.25²    δ3 < 1.25³    rel      rms      log10
0.852        0.976         0.995         0.122    0.500    0.053
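For reference, these standard depth metrics can be computed from a predicted depth map pred and a ground-truth map gt as follows (a minimal NumPy sketch, not the repository's evaluation code; masking and cropping of invalid pixels are omitted):

import numpy as np

def compute_depth_metrics(pred, gt):
    """Threshold accuracies, absolute relative error, RMSE, and log10 error."""
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    delta2 = np.mean(ratio < 1.25 ** 2)
    delta3 = np.mean(ratio < 1.25 ** 3)
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    log10 = np.mean(np.abs(np.log10(pred) - np.log10(gt)))
    return delta1, delta2, delta3, abs_rel, rmse, log10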

Instructions

Installation

To install the required packages, run

pip install -r requirements.txt
Data

NYU Depth v2 train set (50K images, 4.1 GB)

After extraction, you will have a data folder.

NYU Depth v2 test set (654 images, 1 GB)

After extraction, you will have three .npy files.

Training

The script train.py contains the code for training the model. It can be invoked with the following arguments:

usage: train.py [-h] --train_dir TRAIN_DIR --test_dir TEST_DIR --batch_size
                BATCH_SIZE --checkpoint_dir CHECKPOINT_DIR --epochs EPOCHS
                [--checkpoint CHECKPOINT] [--lr LR]
                [--log_interval LOG_INTERVAL] [--backbone BACKBONE]
                [--test_batch_size TEST_BATCH_SIZE]
                [--perceptual_weight PERCEPTUAL_WEIGHT]

Training of depth estimation model

  -h, --help            show this help message and exit

mandatory arguments:
  --train_dir TRAIN_DIR
                        Train directory path - should contain the 'data'
                        folder
  --test_dir TEST_DIR   Test directory path - should contain 3 files
  --batch_size BATCH_SIZE
                        Batch size to process the train data
  --checkpoint_dir CHECKPOINT_DIR
                        Directory to save checkpoints in
  --epochs EPOCHS       Number of epochs
  
optional arguments:
  --checkpoint CHECKPOINT
                        Model checkpoint path
  --lr LR               Learning rate
  --log_interval LOG_INTERVAL
                        Interval to print the avg. loss and metrics
  --backbone BACKBONE   Model backbone: densenet161 or densenet121
  --test_batch_size TEST_BATCH_SIZE
                        Batch size for frequent testing
  --perceptual_weight PERCEPTUAL_WEIGHT
                        Weight for the perceptual loss

It is advised to run the code on a GPU; the code automatically detects whether a GPU is available and uses it.
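A typical training invocation might look like the following; the paths and hyperparameter values are placeholders, not values recommended by the repository:

python train.py --train_dir ./nyu_train --test_dir ./nyu_test \
                --batch_size 8 --epochs 20 --checkpoint_dir ./checkpoints \
                --lr 0.0001 --backbone densenet161 --perceptual_weight 0.1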

Evaluation/Inference

The script evaluate.py contains the code for evaluating the model on the test data or for predicting the depth map of a given image. It can be invoked with the following arguments:


usage: evaluate.py [-h] --model MODEL [--data_dir DATA_DIR] [--img IMG]
                   [--batch_size BATCH_SIZE] [--output_dir OUTPUT_DIR]
                   [--backbone BACKBONE]

Evaluation of depth estimation model on either test data/own images

  -h, --help            show this help message and exit

arguments:
  --model MODEL         Model checkpoint path
  --data_dir DATA_DIR   Test data directory(If evaluation on test data)
  --img IMG             Image path(If evaluation on a single image)
  --batch_size BATCH_SIZE
                        Batch size to process the test data
  --output_dir OUTPUT_DIR
                        Directory to save output depth images
  --backbone BACKBONE   Model backbone - densenet 121 or densenet 161

It is advised to run the code on a GPU; the code automatically detects whether a GPU is available and uses it.
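Typical invocations might look like the following; the checkpoint, data, image, and output paths are placeholders:

# Evaluate on the NYU Depth v2 test set
python evaluate.py --model ./checkpoints/checkpoint.pth --data_dir ./nyu_test \
                   --batch_size 4 --backbone densenet161

# Predict the depth map for a single image
python evaluate.py --model ./checkpoints/checkpoint.pth --img ./example.jpg \
                   --output_dir ./outputs --backbone densenet161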

Citation

Please cite the following if you find the code useful in your research:

@misc{Ashok2020,
  author = {Ashok, Arjun},
  title = {Perceptual Dense Network for High-Quality Monocular Depth Estimation},
  year = {2020},
  publisher = {Zenodo},
  doi = {10.5281/zenodo.4041690},
  version = {1.0},
  url = {https://doi.org/10.5281/zenodo.4041690}
}