PyTorch Extension Library of Optimized Scatter Operations

PyTorch Scatter


This package consists of a small extension library of highly optimized sparse update (scatter and segment) operations for use in PyTorch, which are missing in the main package. Scatter and segment operations can be roughly described as reduce operations based on a given "group-index" tensor. Segment operations require the "group-index" tensor to be sorted, whereas scatter operations have no such requirement.
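To make the "group-index" idea concrete, here is a minimal pure-Python sketch of a sum reduction along a single dimension. The helper names `scatter_sum_reference` and `is_sorted` are hypothetical and not part of the package; the sketch only mirrors the semantics described above and ignores broadcasting, data types, and GPU support.

```python
def scatter_sum_reference(src, index, dim_size=None):
    """Sum the values of `src` that share the same entry in `index`."""
    if len(src) != len(index):
        raise ValueError("src and index must have the same length")
    if dim_size is None:
        # Assumption: the output is just large enough for the biggest group id.
        dim_size = max(index) + 1 if index else 0
    out = [0] * dim_size
    for value, group in zip(src, index):
        out[group] += value
    return out


def is_sorted(index):
    """Segment-style operations assume a non-decreasing group index."""
    return all(a <= b for a, b in zip(index, index[1:]))


src = [1, 2, 3, 4, 5]
unsorted_index = [2, 0, 2, 1, 0]  # acceptable for scatter only
sorted_index = [0, 0, 1, 2, 2]    # also acceptable for segment operations

print(scatter_sum_reference(src, unsorted_index))  # [7, 4, 4]
print(scatter_sum_reference(src, sorted_index))    # [3, 3, 9]
print(is_sorted(unsorted_index), is_sorted(sorted_index))  # False True
```

With a sorted index, every group occupies one contiguous run of `src`, which is the property the segment operations rely on.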

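The package consists of the following operations with reduction types "sum"|"mean"|"min"|"max":

- scatter based on arbitrary indices
- segment_coo based on sorted indices
- segment_csr based on compressed indices via pointers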

In addition, we provide the following composite functions which make use of scatter_* operations under the hood: scatter_std, scatter_logsumexp, scatter_softmax and scatter_log_softmax.
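As an illustration of how such composite functions can be built from per-group reductions, the following pure-Python sketch computes a group-wise log-sum-exp and derives a group-wise softmax from it. The function names ending in `_reference` are hypothetical, the handling of empty groups is an assumption, and the code is not the library's implementation.

```python
import math


def scatter_logsumexp_reference(src, index, dim_size):
    """Per-group log(sum(exp(x))), stabilised by subtracting the group maximum."""
    group_max = [-math.inf] * dim_size
    for value, group in zip(src, index):
        group_max[group] = max(group_max[group], value)

    sums = [0.0] * dim_size
    for value, group in zip(src, index):
        sums[group] += math.exp(value - group_max[group])

    # Assumption: empty groups are reported as -inf in this sketch.
    return [m + math.log(s) if s > 0 else -math.inf
            for m, s in zip(group_max, sums)]


def scatter_softmax_reference(src, index, dim_size):
    """Softmax of each value relative to the other values in its group."""
    lse = scatter_logsumexp_reference(src, index, dim_size)
    return [math.exp(value - lse[group]) for value, group in zip(src, index)]


src = [0.0, 1.0, 0.0, 0.0]
index = [0, 0, 1, 1]
probs = scatter_softmax_reference(src, index, dim_size=2)
print(probs)  # the entries of each group sum to 1
```

The output has the same length as `src`, unlike a plain reduction, because softmax normalises values within their group instead of collapsing them.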

All included operations are broadcastable, work on varying data types, are implemented both for CPU and GPU with corresponding backward implementations, and are fully traceable.



Installation

Update: You can now install pytorch-scatter via Anaconda for all major OS/PyTorch/CUDA combinations 🤗 Given that you have pytorch >= 1.8.0 installed, simply run:

conda install pytorch-scatter -c pyg


We alternatively provide pip wheels for all major OS/PyTorch/CUDA combinations, see here.

PyTorch 2.3

To install the binaries for PyTorch 2.3.0, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.3.0+${CUDA}.html

where ${CUDA} should be replaced by either cpu, cu118, or cu121 depending on your PyTorch installation.

         cpu  cu118  cu121
Linux    ✅   ✅     ✅
Windows  ✅   ✅     ✅
macOS    ✅
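The ${CUDA} substitution can also be derived from the version string that PyTorch reports. The sketch below uses a hypothetical helper, `wheel_index_url`, which is not part of any package. It assumes the string has the form `2.3.0+cu121` and that a missing local suffix means a CPU-only build. It does not check that the resulting page exists.

```python
def wheel_index_url(torch_version):
    """Build the wheel index URL for a version string such as '2.3.0+cu121'."""
    base, _, suffix = torch_version.partition("+")
    cuda = suffix or "cpu"  # assumption: no "+..." suffix means a CPU-only build
    return f"https://data.pyg.org/whl/torch-{base}+{cuda}.html"


print(wheel_index_url("2.3.0+cu121"))
# https://data.pyg.org/whl/torch-2.3.0+cu121.html
```

In practice the argument would be the value of `torch.__version__`, and the result would be passed to `pip install torch-scatter -f`.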

PyTorch 2.2

To install the binaries for PyTorch 2.2.0, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.2.0+${CUDA}.html

where ${CUDA} should be replaced by either cpu, cu118, or cu121 depending on your PyTorch installation.

         cpu  cu118  cu121
Linux    ✅   ✅     ✅
Windows  ✅   ✅     ✅
macOS    ✅

Note: Binaries of older versions are also provided for PyTorch 1.4.0, PyTorch 1.5.0, PyTorch 1.6.0, PyTorch 1.7.0/1.7.1, PyTorch 1.8.0/1.8.1, PyTorch 1.9.0, PyTorch 1.10.0/1.10.1/1.10.2, PyTorch 1.11.0, PyTorch 1.12.0/1.12.1, PyTorch 1.13.0/1.13.1, PyTorch 2.0.0/2.0.1, and PyTorch 2.1.0/2.1.1/2.1.2 (following the same procedure). For older versions, you need to explicitly specify the latest supported version number or install via pip install --no-index in order to prevent a manual installation from source. You can look up the latest supported version number here.

From source

Ensure that at least PyTorch 1.4.0 is installed and verify that cuda/bin and cuda/include are in your $PATH and $CPATH respectively, e.g.:

$ python -c "import torch; print(torch.__version__)"
>>> 1.4.0

$ echo $PATH
>>> /usr/local/cuda/bin:...

$ echo $CPATH
>>> /usr/local/cuda/include:...
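The same environment checks can be scripted. The following sketch is only a heuristic: the helper names `looks_like_cuda_dir` and `check_cuda_env` are hypothetical, and an entry counts as a match when it mentions "cuda" and ends in the expected directory name, which may not hold for every CUDA installation layout.

```python
import os


def looks_like_cuda_dir(entry, leaf):
    """Heuristic: the entry mentions 'cuda' and ends in the given directory."""
    normalized = entry.replace("\\", "/").rstrip("/").lower()
    return "cuda" in normalized and normalized.endswith("/" + leaf)


def check_cuda_env(env=None):
    """Report whether PATH and CPATH each contain a CUDA-looking entry."""
    env = os.environ if env is None else env
    path_entries = env.get("PATH", "").split(os.pathsep)
    cpath_entries = env.get("CPATH", "").split(os.pathsep)
    return {
        "PATH": any(looks_like_cuda_dir(e, "bin") for e in path_entries),
        "CPATH": any(looks_like_cuda_dir(e, "include") for e in cpath_entries),
    }


# The result depends on the machine this runs on.
print(check_cuda_env())
```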

Then run:

pip install torch-scatter

When running in a Docker container without an NVIDIA driver, PyTorch needs to evaluate the compute capabilities and may fail. In this case, ensure that the compute capabilities are set via TORCH_CUDA_ARCH_LIST, e.g.:

export TORCH_CUDA_ARCH_LIST="6.0 6.1 7.2+PTX 7.5+PTX"


Example

import torch
from torch_scatter import scatter_max

src = torch.tensor([[2, 0, 1, 4, 3], [0, 2, 1, 3, 4]])
index = torch.tensor([[4, 5, 4, 2, 3], [0, 0, 2, 2, 1]])

out, argmax = scatter_max(src, index, dim=-1)

print(out)
tensor([[0, 0, 4, 3, 2, 0],
        [2, 4, 3, 0, 0, 0]])

print(argmax)
tensor([[5, 5, 3, 4, 0, 1],
        [1, 4, 3, 5, 5, 5]])
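To see how the values above come about, the following pure-Python sketch reproduces them row by row. `scatter_max_reference` is a hypothetical helper, not the library's implementation. It assumes what the printed output suggests: empty groups are filled with 0, and their argmax is the length of the source row. Keeping the first position on ties is a further assumption of this sketch.

```python
def scatter_max_reference(src_row, index_row, dim_size):
    """Per-group maximum of one row, with the position where it was found."""
    sentinel = len(src_row)      # marks groups that received no value
    out = [0] * dim_size         # assumption: empty groups are filled with 0
    argmax = [sentinel] * dim_size
    for position, (value, group) in enumerate(zip(src_row, index_row)):
        if argmax[group] == sentinel or value > out[group]:
            out[group] = value
            argmax[group] = position
    return out, argmax


src = [[2, 0, 1, 4, 3], [0, 2, 1, 3, 4]]
index = [[4, 5, 4, 2, 3], [0, 0, 2, 2, 1]]

# The output size along the reduced dimension covers the largest index overall.
dim_size = max(max(row) for row in index) + 1

for src_row, index_row in zip(src, index):
    print(scatter_max_reference(src_row, index_row, dim_size))
# ([0, 0, 4, 3, 2, 0], [5, 5, 3, 4, 0, 1])
# ([2, 4, 3, 0, 0, 0], [1, 4, 3, 5, 5, 5])
```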

Running tests



C++ API

torch-scatter also offers a C++ API that contains C++ equivalents of the Python models. To build it, add TorchLib to the -DCMAKE_PREFIX_PATH (e.g., it may exist in {CONDA}/lib/python{X.X}/site-packages/torch if installed via conda):

mkdir build
cd build
# Add -DWITH_CUDA=on for CUDA support
cmake -DCMAKE_PREFIX_PATH="..." ..
make install