Self-Supervised Monocular Depth Estimation with Self-Reference Distillation and Disparity Offset Refinement
Paper link: https://arxiv.org/abs/2302.09789
conda create -n srd python=3.7
conda activate srd
conda install pytorch==1.9.1 torchvision==0.10.1 torchaudio==0.9.1 -c pytorch
pip install -r requirements.txt
You can download the entire raw KITTI dataset by running:
wget -i splits/kitti_archives_to_download.txt -P kitti_data/
Then unzip the archives with:
cd kitti_data
unzip "*.zip"
cd ..
Warning: the dataset is about 175GB, so make sure you have enough free space for the unzipped files as well!
Our default settings expect that you have converted the png images to jpeg with the following command, which also deletes the raw KITTI .png files:
find kitti_data/ -name '*.png' | parallel 'convert -quality 92 -sampling-factor 2x2,1x1,1x1 {.}.png {.}.jpg && rm {}'
Alternatively, you can skip this conversion step and train from the raw png files by adding the --png flag when training, at the expense of slower load times.
The above conversion command creates images which match our experiments, where KITTI .png images were converted to .jpg on Ubuntu 16.04 with default chroma subsampling 2x2,1x1,1x1.
We found that Ubuntu 18.04 defaults to 2x2,2x2,2x2, which gives different results, hence the explicit parameter in the conversion command.
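To make the conversion command easier to inspect, here is a minimal Python sketch that builds the same ImageMagick argument list for each .png file without running anything. The function names build_convert_command and plan_conversion are illustrative and not part of this repository; only the convert options are taken from the command above.

```python
from pathlib import Path

# Options copied from the conversion command in this README.
QUALITY = "92"
SAMPLING_FACTOR = "2x2,1x1,1x1"


def build_convert_command(png_path):
    """Return the ImageMagick argv that would convert one .png to .jpg."""
    png = Path(png_path)
    # with_suffix plays the role of {.}.jpg in the GNU parallel template.
    jpg = png.with_suffix(".jpg")
    return [
        "convert",
        "-quality", QUALITY,
        "-sampling-factor", SAMPLING_FACTOR,
        str(png),
        str(jpg),
    ]


def plan_conversion(root):
    """List the commands for every .png under root (dry run, nothing is executed)."""
    return [build_convert_command(p) for p in sorted(Path(root).rglob("*.png"))]
```

Calling plan_conversion("kitti_data") and printing each entry gives a dry-run listing of what would be converted. Note that this sketch does not delete the .png files, unlike the shell command.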
You can also place the KITTI dataset wherever you like and point towards it with the --data_path flag during training and evaluation.
Splits
The train/test/validation splits are defined in the splits/ folder.
By default, the code will train a depth model using Zhou's subset of the standard Eigen split of KITTI, which is designed for monocular training.
You can also train a model using the new benchmark split or the odometry split by setting the --split flag.
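As a rough illustration of how a split file can be read, the sketch below assumes each line has the form "folder frame_index side" (for example "2011_09_26/2011_09_26_drive_0022_sync 473 r"). That layout is an assumption and is not stated in this README, so check it against the files in splits/ before relying on it. The names Sample, parse_split_line and read_split are hypothetical helpers.

```python
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    folder: str
    frame_index: int
    side: Optional[str]


def parse_split_line(line):
    """Parse one split entry, ASSUMING the form 'folder frame_index side'."""
    parts = line.split()
    folder = parts[0]
    # Fall back to defaults when the optional fields are missing.
    frame_index = int(parts[1]) if len(parts) > 1 else 0
    side = parts[2] if len(parts) > 2 else None
    return Sample(folder, frame_index, side)


def read_split(path):
    """Read every non-empty line of a split file into a list of Sample tuples."""
    with open(path) as f:
        return [parse_split_line(line) for line in f if line.strip()]
```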
Custom dataset
You can train on a custom monocular or stereo dataset by writing a new dataloader class which inherits from MonoDataset; see the KITTIDataset class in datasets/kitti_dataset.py for an example.
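The subclassing pattern can be sketched as follows. The MonoDataset class below is a stand-in stub so that the example runs on its own; it is not the repository's base class, and its constructor arguments and the hook names check_depth and get_image_path are assumptions made for illustration. Consult KITTIDataset in datasets/kitti_dataset.py for the methods the real base class expects.

```python
import os


class MonoDataset:
    """Stand-in stub (hypothetical); the real base class lives in this repository."""

    def __init__(self, data_path, filenames, img_ext=".jpg"):
        self.data_path = data_path
        self.filenames = filenames
        self.img_ext = img_ext

    def __len__(self):
        return len(self.filenames)

    def check_depth(self):
        raise NotImplementedError

    def get_image_path(self, folder, frame_index):
        raise NotImplementedError


class MyMonoDataset(MonoDataset):
    """Hypothetical dataset stored as <data_path>/<folder>/<frame:06d><ext>."""

    def check_depth(self):
        # This example dataset ships no ground-truth depth.
        return False

    def get_image_path(self, folder, frame_index):
        name = "{:06d}{}".format(frame_index, self.img_ext)
        return os.path.join(self.data_path, folder, name)
```

The subclass only overrides the dataset-specific parts (where images live and whether depth is available), leaving shared logic to the base class.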
By default, models and tensorboard event files are saved to ~/tmp/<model_name>.
This can be changed with the --log_dir flag.
Monocular training:
Single GPU:
In train.py, line 9 should import the single-GPU trainer:
from trainer_single_gpu import Trainer
CUDA_VISIBLE_DEVICES=0 python train.py --model_name mono_srd
Distributed training:
In train.py, line 9 should import the distributed trainer:
from trainer import Trainer
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train.py --model_name mono_srd
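The two launch commands differ only in the list of visible GPUs and in whether torch.distributed.launch is used. The sketch below is a small hypothetical helper (build_launch_command is not part of this repository) that assembles either command from a list of GPU ids, keeping --nproc_per_node equal to the number of GPUs. It does not edit the import on line 9 of train.py, which still has to match the chosen mode.

```python
def build_launch_command(gpu_ids, model_name):
    """Return (env, argv) mirroring the training commands in this README."""
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    if len(gpu_ids) == 1:
        # Single GPU: plain python invocation.
        argv = ["python", "train.py", "--model_name", model_name]
    else:
        # Distributed: one process per visible GPU.
        argv = [
            "python", "-m", "torch.distributed.launch",
            "--nproc_per_node={}".format(len(gpu_ids)),
            "train.py", "--model_name", model_name,
        ]
    return env, argv
```

For example, build_launch_command([0, 1, 2, 3], "mono_srd") reproduces the four-GPU command above; the result could be passed to subprocess.run with the env merged into os.environ.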
To prepare the ground truth depth maps, run:
python export_gt_depth.py --data_path kitti_data --split eigen
python export_gt_depth.py --data_path kitti_data --split eigen_benchmark
This assumes that you have placed the KITTI dataset in the default location of ./kitti_data/.
The following example command evaluates the epoch 19 weights of a model named mono_model:
python evaluate_depth.py --load_weights_folder ./save_models/mono_model/models/weights_19/ --eval_mono
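To evaluate other checkpoints, only the weights folder changes. The sketch below assumes checkpoints follow the <save_dir>/<model_name>/models/weights_<epoch> layout seen in the example command; the helper names weights_folder and build_eval_command are hypothetical and not part of this repository.

```python
import os


def weights_folder(save_dir, model_name, epoch):
    """Build the checkpoint folder, ASSUMING the layout used in the example command."""
    return os.path.join(save_dir, model_name, "models", "weights_{}".format(epoch))


def build_eval_command(save_dir, model_name, epoch):
    """Return the argv for evaluate_depth.py on one checkpoint (monocular evaluation)."""
    return [
        "python", "evaluate_depth.py",
        "--load_weights_folder", weights_folder(save_dir, model_name, epoch),
        "--eval_mono",
    ]
```

Looping build_eval_command over a range of epochs gives one command per checkpoint, which can be useful when comparing several saved epochs.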