Segmentation can be viewed as pixel classification: for each pixel of an image we must predict its class, with background being one of the classes. There are two main types of segmentation:
- **Semantic segmentation** only tells the class of each pixel, and does not make a distinction between different objects of the same class.
- **Instance segmentation** divides each class into separate object instances.
For example, in an image of a flock of sheep, instance segmentation treats each sheep as a separate object, while semantic segmentation represents all sheep with a single class.
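Because segmentation is just classification at every pixel, the usual classification machinery carries over directly. Below is a minimal PyTorch sketch; the tensor shapes and the three-class setup are illustrative assumptions, not part of this lesson:

```python
import torch
import torch.nn as nn

num_classes = 3  # assumed example: background, sheep, dog

# A segmentation network outputs one score per class per pixel:
# logits have shape (batch, num_classes, height, width).
logits = torch.randn(8, num_classes, 128, 128)

# The target mask stores a class index for every pixel.
target = torch.randint(0, num_classes, (8, 128, 128))

# Cross-entropy applied per pixel -- the ordinary classification loss,
# just computed at every spatial location.
loss = nn.CrossEntropyLoss()(logits, target)

# The predicted mask is the argmax over the class dimension.
pred_mask = logits.argmax(dim=1)  # shape (8, 128, 128)
```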
There are different neural architectures for segmentation, but they all share the same structure. In a way, it is similar to the autoencoder you learned about previously, but instead of reconstructing the original image, our goal is to construct a mask. Thus, a segmentation network has the following parts:
- **Encoder** extracts features from the input image.
- **Decoder** transforms those features into a mask image of the same size as the input, with the number of channels equal to the number of classes.
The simplest encoder-decoder architecture uses convolutions and poolings in the encoder, and convolutions and upsamplings in the decoder, as in the sketch below.
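To make this concrete, here is a minimal encoder-decoder network in PyTorch; the layer sizes and the `SimpleSegNet` name are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SimpleSegNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, num_classes: int):
        super().__init__()
        # Encoder: convolutions + poolings shrink the spatial size.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 1/2 resolution
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 1/4 resolution
        )
        # Decoder: convolutions + upsamplings restore the original size.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(16, num_classes, 3, padding=1),  # one channel per class
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = SimpleSegNet(num_classes=3)
print(model(torch.randn(1, 3, 128, 128)).shape)  # torch.Size([1, 3, 128, 128])
```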
A slightly more advanced architecture, U-Net, adds skip connections. A skip connection at each convolution level keeps the network from losing information about the features of the original input at that level; see the sketch below.
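The sketch below shows a single skip connection at one level, reusing the toy layer sizes from the previous example; a real U-Net repeats this pattern at several levels:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, num_classes: int):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.Upsample(scale_factor=2)
        # The decoder sees the upsampled features *concatenated* with the
        # encoder features carried over by the skip connection.
        self.dec = nn.Sequential(nn.Conv2d(32 + 16, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, num_classes, 1)

    def forward(self, x):
        skip = self.enc(x)             # features at full resolution
        y = self.mid(self.down(skip))  # features at 1/2 resolution
        y = self.up(y)                 # back to full resolution
        y = self.dec(torch.cat([y, skip], dim=1))  # the skip connection
        return self.head(y)

model = TinyUNet(num_classes=3)
print(model(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 3, 64, 64])
```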
U-Net usually uses a standard pretrained network as its encoder for feature extraction, for example ResNet-50.
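One convenient way to build such a U-Net is the third-party segmentation-models-pytorch package; using it here is an assumption of this sketch, not a requirement of the lesson:

```python
# pip install segmentation-models-pytorch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet50",     # pretrained feature extractor as the encoder
    encoder_weights="imagenet",  # start from ImageNet weights
    in_channels=3,               # RGB input
    classes=3,                   # assumed number of mask classes
)
```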