The Minkowski Engine is an auto-differentiation library for sparse tensors. It supports all standard neural network layers such as convolution, pooling, unpooling, and broadcasting operations for sparse tensors. For more information, please visit the documentation page.
The Minkowski Engine supports various functions that can be built on a sparse tensor. We list a few popular network architectures and applications here. To run the examples, please install the package and run the command in the package root directory.
Compressing a neural network to speed up inference and minimize memory footprint has been studied widely. One popular technique for model compression is pruning the weights in convnets; the resulting models are also known as sparse convolutional networks. Such parameter-space sparsity compresses networks that operate on dense tensors, and all intermediate activations of these networks are also dense tensors.
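To make parameter-space sparsity concrete, here is a minimal sketch using magnitude-based pruning, one simple pruning scheme chosen purely for illustration. The helpers `prune_weights` and `matvec` are hypothetical and not part of the Minkowski Engine. Note that the weights become sparse while the input and output vectors stay dense.

```python
from typing import List


def prune_weights(weights: List[List[float]], threshold: float) -> List[List[float]]:
    """Zero out weights whose magnitude is below `threshold` (illustrative only)."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def matvec(weights: List[List[float]], x: List[float]) -> List[float]:
    """Dense matrix-vector product; the activations remain dense."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


weights = [[0.5, -0.01, 0.0], [0.02, 2.0, -1.0]]
pruned = prune_weights(weights, threshold=0.1)
print(pruned)                           # [[0.5, 0.0, 0.0], [0.0, 2.0, -1.0]]
print(matvec(pruned, [1.0, 1.0, 1.0]))  # [0.5, 1.0]
```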
However, in this work, we focus on spatially sparse data, in particular, spatially sparse high-dimensional inputs. These data can be represented as sparse tensors, which are commonplace in high-dimensional problems such as 3D perception, registration, and statistical data. We define neural networks specialized for these inputs as sparse tensor networks; they process sparse tensors and generate sparse tensors as outputs. To construct a sparse tensor network, we build all standard neural network layers, such as MLPs, non-linearities, convolutions, normalizations, and pooling operations, in the same way we define them on a dense tensor, and implement them in the Minkowski Engine.
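As a rough illustration of the representation described above, the sketch below converts a small dense 2D grid into a list of coordinates and a matching list of features, keeping only the non-zero cells. The helper `dense_to_sparse` is hypothetical and not part of the Minkowski Engine API, and it assumes that a zero value means "empty".

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def dense_to_sparse(grid: List[List[float]]) -> Tuple[List[Coord], List[float]]:
    """Collect the (row, col) coordinates and values of non-zero cells.

    Hypothetical helper for illustration; zero is assumed to mean "empty".
    """
    coords: List[Coord] = []
    feats: List[float] = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value != 0:
                coords.append((r, c))
                feats.append(value)
    return coords, feats


grid = [
    [0.0, 0.0, 1.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.0],
]
coords, feats = dense_to_sparse(grid)
print(coords)  # [(0, 2), (2, 0)]
print(feats)   # [1.5, 2.0]
```

Only two of the twelve cells are stored, which is where the memory savings come from when the data is mostly empty.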
The figure below visualizes one sparse tensor network operation, convolution, on a sparse tensor. The convolution layer on a sparse tensor works similarly to that on a dense tensor. However, on a sparse tensor, we compute convolution outputs only at a few specified points, which we can control in the generalized convolution. For more information, please visit the documentation page on sparse tensor networks and the terminology page.
[Figure: convolution on a dense tensor (left) and on a sparse tensor (right)]
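The idea of computing outputs only at specified points can be sketched in plain Python. This is a simplified, single-channel 2D illustration and not the Minkowski Engine implementation: the function `sparse_conv2d` and its dictionary-based inputs are assumptions made for this example. As in most deep learning libraries, the kernel is applied without flipping.

```python
from typing import Dict, Iterable, Optional, Tuple

Coord = Tuple[int, int]


def sparse_conv2d(
    feats: Dict[Coord, float],
    kernel: Dict[Coord, float],
    out_coords: Optional[Iterable[Coord]] = None,
) -> Dict[Coord, float]:
    """Single-channel convolution evaluated only at `out_coords`.

    `feats` maps occupied coordinates to values and `kernel` maps offsets
    to weights. Unoccupied input coordinates contribute nothing.
    """
    if out_coords is None:
        out_coords = feats.keys()  # default: keep the input sparsity pattern
    out: Dict[Coord, float] = {}
    for (x, y) in out_coords:
        total = 0.0
        for (dx, dy), weight in kernel.items():
            neighbor = (x + dx, y + dy)
            if neighbor in feats:
                total += weight * feats[neighbor]
        out[(x, y)] = total
    return out


feats = {(0, 0): 1.0, (0, 1): 2.0, (3, 3): 4.0}
# 3x3 kernel of ones, expressed as offsets around the output coordinate.
kernel = {(dx, dy): 1.0 for dx in (-1, 0, 1) for dy in (-1, 0, 1)}

print(sparse_conv2d(feats, kernel))
# {(0, 0): 3.0, (0, 1): 3.0, (3, 3): 4.0}
print(sparse_conv2d(feats, kernel, out_coords=[(1, 1), (5, 5)]))
# {(1, 1): 3.0, (5, 5): 0.0}
```

Passing `out_coords` explicitly mirrors the control over output locations mentioned above; leaving it unset reuses the input coordinates.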
- Unlimited high-dimensional sparse tensor support
- All standard neural network layers (Convolution, Pooling, Broadcast, etc.)
- Dynamic computation graph
- Custom kernel shapes
- Multi-GPU training
- Multi-threaded kernel map
- Multi-threaded compilation
- Highly-optimized GPU kernels
- Ubuntu 14.04 or higher
- CUDA 10.1 or higher
- PyTorch 1.3 or higher
- Python 3.6 or higher
- GCC 6 or higher
You can install the Minkowski Engine with pip, with anaconda, or on the system directly. If you experience issues installing the package, please check out the common compilation issues page or the installation wiki page.
If you cannot find a relevant problem, please report the issue on the GitHub issue page.
The Minkowski Engine is distributed on PyPI as MinkowskiEngine and can be installed with pip.
First, install PyTorch following its installation instructions. Next, install OpenBLAS.
sudo apt install libopenblas-dev
pip3 install torch
pip3 install -U MinkowskiEngine
# Alternatively, install the latest version directly from the GitHub source
sudo apt install libopenblas-dev
pip3 install torch
pip3 install -U -I git+https://github.com/StanfordVL/MinkowskiEngine
We recommend python>=3.6 for installation.
First, follow the anaconda documentation to install anaconda on your computer.
conda create -n py3-mink python=3.7
conda activate py3-mink
conda install numpy mkl-include
conda install pytorch -c pytorch
conda activate py3-mink
git clone https://github.com/StanfordVL/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install
Like the anaconda installation, make sure that you install PyTorch with the same CUDA version that nvcc uses.
# install system requirements
sudo apt install python3-dev libopenblas-dev
# Skip if you already have pip installed on your python3
curl https://bootstrap.pypa.io/get-pip.py | python3
# Install python requirements
python3 -m pip install torch numpy
git clone https://github.com/StanfordVL/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install
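The CUDA version check mentioned above can be sketched as follows. This is a minimal example, assuming that the output of `nvcc --version` contains a "release X.Y" string; the helper names `parse_nvcc_release` and `cuda_versions_match` are hypothetical, and the sample text is illustrative only.

```python
import re
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(nvcc_output: str, torch_cuda: Optional[str]) -> bool:
    """Compare the nvcc release with the CUDA version PyTorch was built with."""
    release = parse_nvcc_release(nvcc_output)
    return release is not None and release == torch_cuda


# Sample text in the format printed by `nvcc --version` (illustrative only).
sample = "Cuda compilation tools, release 10.1, V10.1.243"
print(parse_nvcc_release(sample))           # 10.1
print(cuda_versions_match(sample, "10.1"))  # True
print(cuda_versions_match(sample, "10.2"))  # False
```

In practice, `nvcc_output` would come from running `nvcc --version` and `torch_cuda` from `torch.version.cuda`, which is `None` for CPU-only PyTorch builds.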
The Minkowski Engine supports a CPU-only build on platforms that do not have NVIDIA GPUs. Please refer to the quick start for more details.
To use the Minkowski Engine, first import the engine and then define the
network. If your data is not quantized, you need to voxelize or quantize the
(spatial) data into a sparse tensor. The Minkowski Engine provides a
quantization function for this (MinkowskiEngine.utils.sparse_quantize).
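To show what quantization means here, the following sketch maps continuous points to integer voxel coordinates and keeps one point per voxel. It is a plain-Python illustration under stated assumptions, not the library's implementation: `quantize_points` and its `voxel_size` parameter are hypothetical names, and keeping the first point in each voxel is just one possible collision policy.

```python
import math
from typing import Dict, List, Sequence, Tuple


def quantize_points(
    points: Sequence[Sequence[float]],
    feats: Sequence[Sequence[float]],
    voxel_size: float,
) -> Tuple[List[Tuple[int, ...]], List[Sequence[float]]]:
    """Map continuous points to integer voxel coordinates.

    Keeps the first point that falls in each voxel and drops the rest.
    """
    seen: Dict[Tuple[int, ...], int] = {}
    for index, point in enumerate(points):
        voxel = tuple(math.floor(value / voxel_size) for value in point)
        seen.setdefault(voxel, index)  # first point in a voxel wins
    coords = list(seen.keys())
    kept = [feats[i] for i in seen.values()]
    return coords, kept


points = [(0.6, 1.2), (0.7, 1.4), (2.3, -0.2)]
feats = [(1.0,), (2.0,), (3.0,)]
coords, kept = quantize_points(points, feats, voxel_size=0.5)
print(coords)  # [(1, 2), (4, -1)]
print(kept)    # [(1.0,), (3.0,)]
```

The first two points fall into the same voxel, so only one of them survives; the resulting unique integer coordinates and their features are what a sparse tensor is built from.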
import torch.nn as nn
import MinkowskiEngine as ME

class ExampleNetwork(ME.MinkowskiNetwork):

    def __init__(self, in_feat, out_feat, D):
        super(ExampleNetwork, self).__init__(D)
        self.conv1 = nn.Sequential(
            ME.MinkowskiConvolution(
                in_channels=in_feat,
                out_channels=64,
                kernel_size=3,
                stride=2,
                dilation=1,
                has_bias=False,
                dimension=D),
            ME.MinkowskiBatchNorm(64),
            ME.MinkowskiReLU())
        self.conv2 = nn.Sequential(
            ME.MinkowskiConvolution(
                in_channels=64,
                out_channels=128,
                kernel_size=3,
                stride=2,
                dimension=D),
            ME.MinkowskiBatchNorm(128),
            ME.MinkowskiReLU())
        self.pooling = ME.MinkowskiGlobalPooling()
        self.linear = ME.MinkowskiLinear(128, out_feat)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = self.pooling(out)
        return self.linear(out)

# loss and network
criterion = nn.CrossEntropyLoss()
net = ExampleNetwork(in_feat=3, out_feat=5, D=2)
print(net)
# a data loader must return a tuple of coords, features, and labels.
coords, feat, label = data_loader()
input = ME.SparseTensor(feat, coords=coords)
# Forward
output = net(input)
# Loss
loss = criterion(output.F, label)
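The data loader in the snippet above is left undefined. One common way to batch several samples for a sparse tensor is to tag every coordinate with the index of the sample it came from. The sketch below assumes the batch index is stored in the first coordinate column, which may differ between Minkowski Engine versions; `collate_samples` is a hypothetical helper, not a library function.

```python
from typing import List, Sequence, Tuple

# One sample: (coordinates, features, label).
Sample = Tuple[Sequence[Sequence[int]], Sequence[Sequence[float]], int]


def collate_samples(samples: Sequence[Sample]):
    """Merge samples into one coordinate list, one feature list, and labels.

    Assumption: the batch index is prepended as the first coordinate column.
    """
    coords: List[Tuple[int, ...]] = []
    feats: List[Sequence[float]] = []
    labels: List[int] = []
    for batch_index, (sample_coords, sample_feats, label) in enumerate(samples):
        for coord, feat in zip(sample_coords, sample_feats):
            coords.append((batch_index, *coord))
            feats.append(feat)
        labels.append(label)
    return coords, feats, labels


samples = [
    ([(0, 0), (1, 2)], [(0.1,), (0.2,)], 3),
    ([(5, 5)], [(0.9,)], 1),
]
coords, feats, labels = collate_samples(samples)
print(coords)  # [(0, 0, 0), (0, 1, 2), (1, 5, 5)]
print(labels)  # [3, 1]
```

The lists would still need to be converted to tensors before being passed to the network.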
For discussion and questions, please use minkowskiengine@googlegroups.com.
For API and general usage, please refer to the MinkowskiEngine documentation
page for more detail.
For issues not covered by the API documentation, and for feature requests, feel free to submit an issue on the GitHub issue page.
If you use the Minkowski Engine, please cite:
@inproceedings{choy20194d,
title={4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks},
author={Choy, Christopher and Gwak, JunYoung and Savarese, Silvio},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={3075--3084},
year={2019}
}