This source code release accompanies the paper
Probabilistic Pixel-Adaptive Refinement Networks
Anne S. Wannenwetsch, Stefan Roth.
In CVPR 2020.
The code in this repository can be used to refine the outputs of (probabilistic) deep networks with image-adaptive, confidence-aware convolutions. We illustrate applications to the refinement of optical flow and semantic segmentation estimates.
Contact: Anne Wannenwetsch (anne.wannenwetsch@visinf.tu-darmstadt.de)
The code was tested with Python 3.6, PyTorch 1.0.0 and CUDA 9.0.
Further requirements can be installed with
pip install -r requirements.txt
We further require the code that accompanies the paper
Pixel-Adaptive Convolutional Neural Networks. Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, and Jan Kautz. CVPR 2019
which underlies our probabilistic pixel-adaptive convolutions (PPACs). Please download the corresponding repository from https://github.com/NVlabs/pacnet, e.g. using
git clone https://github.com/NVlabs/pacnet.git
For the application to optical flow, please also download the code of the paper
Hierarchical Discrete Distribution Decomposition for Match Density Estimation. Zhichao Yin, Trevor Darrell, Fisher Yu. CVPR 2019
from https://github.com/ucbdrive/hd3, e.g.
git clone https://github.com/ucbdrive/hd3.git
PPAC refinement requires some small changes to the original HD3 code. Moreover, we adjust the code so that invalidity masks are also supported for the Sintel dataset. Please apply the provided patch to the downloaded HD3 repository. For instance, you could do the following:
cp ~/ppac_refinement/0001-Apply-PPAC-changes.patch ~/hd3
cd ~/hd3
git am -3 < 0001-Apply-PPAC-changes.patch
Please adapt the paths accordingly if the directories ppac_refinement and hd3 are not located in your home folder.
Before running the code, make sure to set PYTHONPATH appropriately, e.g. by performing the following:
cd ~/ppac_refinement
export PYTHONPATH=$PYTHONPATH:`pwd`/src
export PYTHONPATH=$PYTHONPATH:~/pacnet
export PYTHONPATH=$PYTHONPATH:~/hd3
Again, please adapt the paths if the directories are not located in your home folder.
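The same setup can also be done from within Python; the helper below is our own sketch (not part of the repository) mirroring the export commands above, with the usual home-folder paths as an assumption.

```python
import os
import sys

# Hypothetical helper (not part of the repository): prepend the given
# directories to PYTHONPATH and to sys.path so that both child processes
# and the current interpreter can import from them.
def extend_pythonpath(*dirs):
    existing = os.environ.get("PYTHONPATH", "")
    parts = list(dirs) + ([existing] if existing else [])
    os.environ["PYTHONPATH"] = os.pathsep.join(parts)
    for d in reversed(dirs):
        if d not in sys.path:
            sys.path.insert(0, d)

# Adapt these paths if the repositories are not located in your home folder.
extend_pythonpath(os.path.expanduser("~/ppac_refinement/src"),
                  os.path.expanduser("~/pacnet"),
                  os.path.expanduser("~/hd3"))
```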
We provide code for the refinement of two different estimate types: optical flow fields and semantic segmentation maps. Depending on the specified options (see section below), the different networks are built and trained or evaluated.
Sample scripts that illustrate the usage of the training functions bin/train_flow_refined.py and bin/train_segmentation_refined.py, and in particular the default settings of the available parameters, can be found in the directory scripts.
Please note that we trained all our networks on a setup using two GPUs simultaneously.
Moreover, we include a function bin/inference_hd3_refined.py which directly estimates (and evaluates) PPAC-HD3 optical flow given input image pairs as well as pre-trained HD3 and PPAC refinement checkpoints.
For testing purposes, we have included sample images and ground truth from the Pascal VOC 2012, Sintel, and KITTI datasets in sample_data/images as well as sample_data/flow, sample_data/invalid, and sample_data/segmentation, respectively.
To test PPAC refinement on these samples, please run
bash scripts/train_refine_sintel.sh
bash scripts/train_refine_kitti.sh
bash scripts/train_refine_pascal.sh
using the option --evaluate_only and specifying an appropriate save folder with --save_folder.
One should expect AEE=0.33 for the Sintel sample, AEE=0.55 for the KITTI sample, and mIoU=0.98 for the Pascal sample as performance of the refined estimates.
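Here, AEE denotes the average endpoint error, i.e. the Euclidean distance between estimated and ground-truth flow vectors, averaged over all pixels. A minimal numpy sketch (the function name is ours):

```python
import numpy as np

def average_endpoint_error(flow_est, flow_gt):
    """AEE: mean Euclidean distance between flow vectors of shape (H, W, 2)."""
    return float(np.mean(np.linalg.norm(flow_est - flow_gt, axis=-1)))

# A flow estimate that is off by (3, 4) pixels everywhere has AEE = 5.
gt = np.zeros((4, 5, 2), dtype=np.float32)
est = gt + np.array([3.0, 4.0], dtype=np.float32)
```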
The training procedure requires as input the saved estimates of the underlying, task-specific neural networks. Estimates have to be provided in .npy format in a directory specified by the parameter --flow_root or --logits_root.
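As an illustration, such an estimate could be written as follows; the file name and the (H, W, 2) flow shape are our assumptions, not the repository's conventions:

```python
import os
import tempfile

import numpy as np

# Sketch: store a full-resolution flow estimate as .npy so it can be read
# from the directory passed via --flow_root. Shape (H, W, 2) is an assumed
# layout (u/v displacement per pixel); the file name is illustrative.
flow = np.random.randn(436, 1024, 2).astype(np.float32)
flow_root = tempfile.mkdtemp()
np.save(os.path.join(flow_root, "frame_0001.npy"), flow)

loaded = np.load(os.path.join(flow_root, "frame_0001.npy"))
```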
For our optical flow experiments, we used different (fine-tuned) HD3 models as described in Section 6.1 of our paper, which can be downloaded from https://github.com/ucbdrive/hd3.
For semantic segmentation on Pascal VOC 2012, we applied the checkpoint xception_coco_voc_trainaug of DeepLabV3+, which can be found at https://github.com/qixuxiang/deeplabv3plus/blob/master/g3doc/model_zoo.md. Sample files for both tasks can be found in sample_data/inputs_flow and sample_data/inputs_segmentation.
Please note that PPAC refinement takes network predictions at full resolution as inputs, i.e. you should save the estimates after the (bilinear) upsampling step.
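This upsampling step can be sketched as follows; the numpy implementation is our illustration, not the repository's code. Note that flow vectors are pixel displacements, so their values must also be rescaled with the resolution change.

```python
import numpy as np

# Illustrative bilinear upsampling of a coarse flow field of shape
# (in_h, in_w, 2) to (out_h, out_w, 2), followed by rescaling of the
# flow values (pixel displacements scale with the resolution).
def upsample_flow(flow, out_h, out_w):
    in_h, in_w, _ = flow.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :, None]  # horizontal interpolation weights
    top = flow[y0][:, x0] * (1 - wx) + flow[y0][:, x1] * wx
    bot = flow[y1][:, x0] * (1 - wx) + flow[y1][:, x1] * wx
    up = top * (1 - wy) + bot * wy
    up[..., 0] *= out_w / in_w  # rescale horizontal displacements
    up[..., 1] *= out_h / in_h  # rescale vertical displacements
    return up.astype(np.float32)
```

For instance, upsampling a constant unit flow to twice the resolution yields a constant flow of two pixels in each direction.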
For optical flow, one can save the required HD3 flow fields and probabilities by calling inference_hd3_refined.py with the option --save_inputs.
While we created our Sintel and KITTI benchmark uploads with this method, please note that the underlying flow fields for Tables 1 and 2 of our paper were saved using the function train.py of the original HD3 repository. The latter applies a different approach to rescale input images and thus leads to (slightly) different HD3 results.
- dataset_name: Name of the dataset used for training; e.g. determines the learning rate schedule
- data_root: Root directory of the training/validation data
- flow/logits_root: Folder with the input flow/segmentation estimates
- train/val_list: List of samples used for training and validation, respectively
- base_lr: Learning rate used for all parameters without an explicitly defined learning rate
- preprocessing_lr: Learning rate used for the guidance and probability branch if specified (otherwise base_lr is used)
- batch_size(_val): Batch size used during training/validation
- epochs: Total number of training epochs
- save_folder: Folder to which summaries, visualizations, refined estimates etc. are saved
- kernel_size_preprocessing/joint: Kernel size used in the preprocessing and combination branch, respectively
- depth_layers_guidance/prob/joint: List of the number of channels used in the guidance, probability and combination branch, respectively
- conv_specification: Determines the type of convolutions used in the combination branch (p = PPACs, c = standard convolutions)
- shared_filters: Determines whether the convolution weight is shared across all channels of the input estimates
- pretrained_model_refine/model_refine_path: Path to a pretrained PPAC refinement model
- evaluate_only: Perform only one validation pass of the provided training procedure
- visualize: Save visualizations of inputs, refined estimates and, if available, ground truth
- save_inputs: Save HD3 inputs as required during training
- save_refined: Save refined estimates
Please note that not all of the above options are applicable to all training/inference methods. To see all available parameters and the corresponding explanations, you can use
python <train_or_inference_method.py> --help
replacing <...> with the respective training or inference function.
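As a rough sketch of how a few of the listed options translate into command-line flags (the defaults below are illustrative, not the repository's actual values):

```python
import argparse

# Illustrative declaration of a few of the listed options; the real
# defaults and help strings live in the train/inference scripts.
parser = argparse.ArgumentParser(description="PPAC refinement (sketch)")
parser.add_argument("--dataset_name", type=str, default="sintel")
parser.add_argument("--base_lr", type=float, default=1e-4)
parser.add_argument("--evaluate_only", action="store_true")
parser.add_argument("--save_folder", type=str, default="results")

# Parse an example command line instead of sys.argv.
args = parser.parse_args(["--evaluate_only", "--base_lr", "0.001"])
```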
The data splits used in this paper for training, validation and test are the same as in our previous paper Learning Task-Specific Generalized Convolutions in the Permutohedral Lattice and can be found in the corresponding repository at https://github.com/visinf/semantic_lattice/tree/master/experiments/lists.
In the directory checkpoints, we provide pre-trained PPAC networks for
optical flow and semantic segmentation which underlie the results
presented in Tables 3, 4, and 5 of the main paper.
Please refer to the paper as well as the supplemental material for the
specifics of these networks.
For illustration purposes, we finally include a non-probabilistic PAC
network (PacNetAdvancedNormalization) using our advanced normalization
scheme in src/models_refine/refinement_network.py. This network is
applicable to data without probabilities. Please note that we
concatenated image guidance data and probabilities for the PAC
experiments in our paper.
If you use our code, please cite our CVPR 2020 paper:
@inproceedings{Wannenwetsch:2020:PPA,
title = {Probabilistic Pixel-Adaptive Refinement Networks},
author = {Anne S. Wannenwetsch and Stefan Roth},
booktitle = {CVPR},
year = {2020}}
Parts of this code are inspired and/or adapted from code available in the following repositories:
- https://github.com/ucbdrive/hd3
- https://github.com/NVlabs/pacnet
- https://github.com/visinf/semantic_lattice
Corresponding files are labeled accordingly.