/segmentation_models.pytorch

Segmentation models with pretrained backbones. PyTorch.

Primary LanguagePythonMIT LicenseMIT

logo
Python library with Neural Networks for Image
Segmentation based on PyTorch.

PyPI version Build Status Generic badge

The main features of this library are:

  • High level API (just two lines to create neural network)
  • 5 models architectures for binary and multi class segmentation (including legendary Unet)
  • 46 available encoders for each architecture
  • All encoders have pre-trained weights for faster and better convergence

Table of content

  1. Quick start
  2. Examples
  3. Models
    1. Architectures
    2. Encoders
  4. Models API
    1. Input channels
    2. Auxiliary classification output
    3. Depth
  5. Installation
  6. Competitions won with the library
  7. License
  8. Contributing

Quick start

Since the library is built on the PyTorch framework, created segmentation model is just a PyTorch nn.Module, which can be created as easy as:

import segmentation_models_pytorch as smp

model = smp.Unet()

Depending on the task, you can change the network architecture by choosing backbones with fewer or more parameters and use pretrainded weights to initialize it:

model = smp.Unet('resnet34', encoder_weights='imagenet')

Change number of output classes in the model:

model = smp.Unet('resnet34', classes=3, activation='softmax')

All models have pretrained encoders, so you have to prepare your data the same way as during weights pretraining:

from segmentation_models_pytorch.encoders import get_preprocessing_fn

preprocess_input = get_preprocessing_fn('resnet18', pretrained='imagenet')

Examples

  • Training model for cars segmentation on CamVid dataset here.
  • Training SMP model with Catalyst (high-level framework for PyTorch), Ttach (TTA library for PyTorch) and Albumentations (fast image augmentation library) - here Open In Colab

Models

Architectures

Encoders

Encoder Weights Params, M
resnet18 imagenet 11M
resnet34 imagenet 21M
resnet50 imagenet 23M
resnet101 imagenet 42M
resnet152 imagenet 58M
resnext50_32x4d imagenet 22M
resnext101_32x8d imagenet
instagram
86M
resnext101_32x16d instagram 191M
resnext101_32x32d instagram 466M
resnext101_32x48d instagram 826M
dpn68 imagenet 11M
dpn68b imagenet+5k 11M
dpn92 imagenet+5k 34M
dpn98 imagenet 58M
dpn107 imagenet+5k 84M
dpn131 imagenet 76M
vgg11 imagenet 9M
vgg11_bn imagenet 9M
vgg13 imagenet 9M
vgg13_bn imagenet 9M
vgg16 imagenet 14M
vgg16_bn imagenet 14M
vgg19 imagenet 20M
vgg19_bn imagenet 20M
senet154 imagenet 113M
se_resnet50 imagenet 26M
se_resnet101 imagenet 47M
se_resnet152 imagenet 64M
se_resnext50_32x4d imagenet 25M
se_resnext101_32x4d imagenet 46M
densenet121 imagenet 6M
densenet169 imagenet 12M
densenet201 imagenet 18M
densenet161 imagenet 26M
inceptionresnetv2 imagenet
imagenet+background
54M
inceptionv4 imagenet
imagenet+background
41M
efficientnet-b0 imagenet 4M
efficientnet-b1 imagenet 6M
efficientnet-b2 imagenet 7M
efficientnet-b3 imagenet 10M
efficientnet-b4 imagenet 17M
efficientnet-b5 imagenet 28M
efficientnet-b6 imagenet 40M
efficientnet-b7 imagenet 63M
mobilenet_v2 imagenet 2M
xception imagenet 22M

Models API

  • model.encoder - pretrained backbone to extract features of different spatial resolution
  • model.decoder - depends on models architecture (Unet/Linknet/PSPNet/FPN)
  • model.segmentation_head - last block to produce required number of mask channels (include also optional upsampling and activation)
  • model.classification_head - optional block which create classification head on top of encoder
  • model.forward(x) - sequentially pass x through model`s encoder, decoder and segmentation head (and classification head if specified)
Input channels

Input channels parameter allow you to create models, which process tensors with arbitrary number of channels. If you use pretrained weights from imagenet - weights of first convolution will be reused for 1- or 2- channels inputs, for input channels > 4 weights of first convolution will be initialized randomly.

model = smp.FPN('resnet34', in_channels=1)
mask = model(torch.ones([1, 1, 64, 64]))
Auxiliary classification output

All models support aux_params parameters, which is default set to None. If aux_params = None than classification auxiliary output is not created, else model produce not only mask, but also label output with shape NC. Classification head consist of GlobalPooling->Dropout(optional)->Linear->Activation(optional) layers, which can be configured by aux_params as follows:

aux_params=dict(
    pooling='avg',             # one of 'avg', 'max'
    dropout=0.5,               # dropout ratio, default is None
    activation='sigmoid',      # activation function, default is None
    classes=4,                 # define number of output labels
)
model = smp.Unet('resnet34', classes=4, aux_params=aux_params)
mask, label = model(x)
Depth

Depth parameter specify a number of downsampling operations in encoder, so you can make your model lighted if specify smaller depth.

model = smp.Unet('resnet34', encoder_depth=4)

Installation

PyPI version:

$ pip install segmentation-models-pytorch

Latest version from source:

$ pip install git+https://github.com/qubvel/segmentation_models.pytorch

Competitions won with the library

Segmentation Models package is widely used in the image segmentation competitions. Here you can find competitions, names of the winners and links to their solutions.

License

Project is distributed under MIT License

Contributing

Run test
$ docker build -f docker/Dockerfile.dev -t smp:dev . && docker run --rm smp:dev pytest -p no:cacheprovider
Generate table
$ docker build -f docker/Dockerfile.dev -t smp:dev . && docker run --rm smp:dev python misc/generate_table.py