Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation

Introduction

This repo is the official implementation of "Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation" on COCO object detection. The code is based on MMDetection v0.6.0.

Deterministic sampling(middle) and stochastic sampling(right) from the raw image(left).

Stochastic sampling-interpolation network(a) and comparison between deterministic sampling(b, left) and stochastic sampling(b, right).

Abstract. In the feature maps of CNNs, there commonly exists considerable spatial redundancy that leads to much repetitive processing. Towards reducing this superﬂuous computation, we propose to compute features only at sparsely sampled locations, which are probabilistically chosen according to activation responses, and then densely reconstruct the feature map with an eﬃcient interpolation procedure. With this sampling-interpolation scheme, our network avoids expending computation on spatial locations that can be eﬀectively interpolated, while being robust to activation prediction errors through broadly distributed sampling. A technical challenge of this sampling-based approach is that the binary decision variables for representing discrete sampling locations are non-diﬀerentiable, making them incompatible with backpropagation. To circumvent this issue, we make use of a reparameterization trick based on the Gumbel-Softmax distribution, with which backpropagation can iterate these variables towards binary values. The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.

Model Zoo

All the models are based on the Faster R-CNN with FPN.

Backbone	Resolution	Sparse Loss Weight	mAP	GFlops	Config	Model URL	Model sha256sum
ResNet-101	500	-	38.5	70.0	config	model	206b4c0e
ResNet-101	600	-	40.4	100.2	config	model	c4e102de
ResNet-101	800	-	42.3	184.1	config	model	3fc2af7a
ResNet-101	1000	-	43.4	289.5	config	model	e043c999
SparseResNet-101	1000	0.02	43.3	164.8	config	model	16a152e0
SparseResNet-101	1000	0.05	42.7	120.3	config	model	f0a467c8
SparseResNet-101	1000	0.1	41.9	94.4	config	model	1c9bf665
SparseResNet-101	1000	0.2	40.7	71.4	config	model	46044e4a

Getting Started

Requirements

At present, we have not checked the compatibility of the code with other versions of the packages, so we only recommend the following configuration.

Python 3.7
PyTorch == 1.1.0
Torchvision == 0.3.0
CUDA 9.0
Other dependencies

Installation

We recommand using conda env to setup the experimental environments.

# Create environment
conda create -n SAI_Det python=3.7 -y
conda activate SAI_Det

# Install PyTorch & Torchvision
conda install pytorch=1.1.0 cudatoolkit=9.0 torchvision -c pytorch -y

# Clone repo
git clone https://github.com/zdaxie/SpatiallyAdaptiveInference-Detection ./SAI_Det
cd ./SAI_Det

# Create soft link for data
mkdir data
cd data
ln -s ${COCO-Path} ./coco
cd ..

# Install requirements and Compile operators
./init.sh

Running

For now, we only support training with 8 GPUs.

# Test with the given config & model
./tools/dist_test.sh ${config-path} ${model-path} ${num-gpus} --out ${output-file.pkl}

# Train with the given config
./tools/dist_train.sh ${config-path} ${num-gpus}

License

This project is released under the Apache 2.0 license.

Citation

If you use our codebase or models in your research, please cite this project.

@InProceedings{xie2020spatially,
author = {Xie, Zhenda and Zhang, Zheng and Zhu, Xizhou and Huang, Gao and Lin, Steve},
title = {Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2020},
month = {August},
}