This is a PyTorch implementation of our paper:
Hierarchical Discrete Distribution Decomposition for Match Density Estimation (CVPR 2019)
Zhichao Yin, Trevor Darrell, Fisher Yu
We propose a framework suitable for learning probabilistic pixel correspondences. It has applications including stereo matching and optical flow, with inherent uncertainty estimation. HD3 achieves state-of-the-art results for both tasks on established benchmarks (KITTI & MPI Sintel).
arXiv preprint: https://arxiv.org/abs/1812.06264
This code has been tested with Python 3.6, PyTorch 1.0 and CUDA 9.0 on Ubuntu 16.04.
- Install PyTorch 1.0. We recommend using anaconda3 to manage the Python environment. You can install all the dependencies with:
pip install -r requirements.txt
- Download all the relevant datasets: FlyingChairs, FlyingThings3D (we use the DispNet/FlowNet2.0 dataset subsets, following the practice of FlowNet 2.0), KITTI, and MPI Sintel.
To train a model on a specific dataset, simply run
bash scripts/train.sh
Note that the scripts contain several placeholders that you should replace with your own choices. For instance, you can specify the dataset type (e.g. FlyingChairs) via --dataset_name, change the network architecture via --encoder and --decoder, and switch the task (stereo or flow) via --task. You can also partially load the weights of a pretrained backbone network via --pretrain_base (an ImageNet-pretrained DLA-34 is available for download), or strictly initialize the weights from a pretrained model via --pretrain.
You can then start a tensorboard session by
tensorboard --logdir=/path/to/log/files --port=8964
and visualize your training progress by opening http://localhost:8964 in your browser.
- We provide the learning rate schedules and augmentation configurations used in all of our experiments. For other hyperparameters, please refer to our paper to reproduce our results.
To test a model on a folder of images, please run
bash scripts/test.sh
Please provide the list of image pair names and pass it to --data_list. This script will generate predictions for every pair of images and save them in the --save_folder with the same folder hierarchy as the input images. You can choose the saved flow format (e.g. png or flo) via --flow_format. When the folder contains images of different input sizes (e.g. KITTI), please make sure the --batch_size is 1.
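The list file can be generated with a few lines of Python. The sketch below is only an illustration: the helper name build_data_list and the file names are hypothetical, and it assumes one whitespace-separated pair per line. The optional third column follows the three-field layout this README gives for evaluation. Whether paths should be relative to a data root is not covered here, so please check the test script for that.

```python
def build_data_list(first_images, second_images, ground_truths=None):
    """Return one text line per image pair: 'img1 img2' or 'img1 img2 gt'.

    Hypothetical helper, not part of this repository.
    """
    if len(first_images) != len(second_images):
        raise ValueError("both image lists must have the same length")
    columns = [first_images, second_images]
    if ground_truths is not None:
        if len(ground_truths) != len(first_images):
            raise ValueError("need exactly one ground-truth file per pair")
        columns.append(ground_truths)
    # zip(*columns) walks the columns row by row.
    return [" ".join(row) for row in zip(*columns)]


# Hypothetical frame names; consecutive frames form the optical flow pairs.
frames = ["seq/frame_0001.png", "seq/frame_0002.png", "seq/frame_0003.png"]
lines = build_data_list(frames[:-1], frames[1:])

# The same pairs with a hypothetical ground-truth file for each.
gt_files = ["seq/flow_0001.flo", "seq/flow_0002.flo"]
eval_lines = build_data_list(frames[:-1], frames[1:], gt_files)
```

Joining the lines with newlines and writing them to a text file gives something that can be passed to --data_list.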
- When the ground truth is available, you can optionally enable the argument --evaluate to calculate the End-Point-Error of your predictions. Please make sure each line of the list consists of img-name1 img-name2 gtruth-name.
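For reference, End-Point-Error is commonly defined as the mean Euclidean distance between predicted and ground-truth flow vectors. The following is a minimal NumPy sketch of that standard definition, not the evaluation code used by the test script. The function name end_point_error, the (H, W, 2) array layout and the optional validity mask are assumptions made for illustration.

```python
import numpy as np


def end_point_error(pred, gt, valid=None):
    """Mean Euclidean distance between flow fields of shape (H, W, 2).

    `valid` is an optional boolean (H, W) mask selecting the pixels to
    average over, e.g. for sparse ground truth. Hypothetical helper.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    # Per-pixel length of the error vector (du, dv).
    err = np.linalg.norm(pred - gt, axis=-1)
    if valid is not None:
        err = err[np.asarray(valid, dtype=bool)]
    return float(err.mean())


# Toy example: one of four pixels is off by the vector (3, 4).
gt = np.zeros((2, 2, 2))
pred = np.zeros((2, 2, 2))
pred[0, 0] = (3.0, 4.0)
epe_all = end_point_error(pred, gt)

# Restricting the average to that single pixel.
mask = np.array([[True, False], [False, False]])
epe_masked = end_point_error(pred, gt, valid=mask)
```

In the toy example the single error vector has length 5, so averaging over all four pixels and averaging over only the masked pixel give different values. This is why the choice of valid pixels matters on datasets with sparse ground truth.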
We provide pretrained models for all of our experiments. To download them, simply run
bash scripts/download_models.sh
The names of the models come in the format of model-name_dataset-names. Models are named hd3f or hd3s for optical flow and stereo matching, respectively. A suffix of c is appended for models with the context module. The dataset-names part indicates the dataset schedule used for training the model. You should be able to obtain similar results by running the test script we provide.
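The naming convention can be decoded programmatically, for example to pick the right --task for a downloaded checkpoint. This is a rough sketch under stated assumptions: parse_model_name is a hypothetical helper, the example name is made up for illustration, and the first underscore is assumed to separate the model name from the dataset schedule. The schedule is kept as a raw string because the separator between individual datasets is not specified here.

```python
def parse_model_name(name):
    """Split 'model-name_dataset-names' into task, context flag and schedule.

    Hypothetical helper, not part of this repository.
    """
    # Assumption: the first underscore ends the model name.
    model, _, schedule = name.partition("_")
    if model[:4] not in ("hd3f", "hd3s"):
        raise ValueError("unrecognized model name: " + repr(model))
    suffix = model[4:]
    if suffix not in ("", "c"):
        raise ValueError("unrecognized suffix: " + repr(suffix))
    return {
        "task": "flow" if model[3] == "f" else "stereo",
        "context": suffix == "c",  # 'c' marks the context module
        "datasets": schedule,
    }


# Made-up name following the stated pattern.
info = parse_model_name("hd3fc_chairs_things")
```

Here info describes a flow model with the context module and the schedule string "chairs_things".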
If you find our work or our repo useful in your research, please consider citing our paper:
@InProceedings{Yin_2019_CVPR,
author = {Yin, Zhichao and Darrell, Trevor and Yu, Fisher},
title = {Hierarchical Discrete Distribution Decomposition for Match Density Estimation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}
We thank Houning Hu for making the teaser image, Simon Niklaus for the correlation operator and Clément Pinard for the FlowNet implementation.