Summaries and experiments covering basic segmentation, human segmentation, and human or portrait matting for both images and videos.

Apache License 2.0

Segmentation-Series-Chaos

This summary covers basic segmentation, human segmentation, and human or portrait matting for both images and videos. It may be a little chaotic, so I called it Segmentation-Series-Chaos. If you want a clearer picture, feel free to fork and modify.

Summary of the 2019 Neurocomputing survey on semantic segmentation using deep learning techniques, plus other useful insights

Survey on semantic segmentation using deep learning techniques


| model/year | para | infer time (ms) | FLOPs | accuracy (VOC2012/COCO/Cityscapes, %) | paper | code | more |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FCN-8s/2015 | ~134M | 175 | - | 67.20/-/65.30 | Fully Convolutional Networks for Semantic Segmentation | https://github.com/shelhamer/fcn.berkeleyvision.org | Beginning of FCN for segmentation, arbitrary input size |
| PSPNet/2017 | 65.7M | - | - | 85.40/-/80.20 | Pyramid Scene Parsing Network | https://github.com/hszhao/PSPNet | Multi-scale feature ensembling, pyramid pooling module |
| DeepLab V3-JFT (more pre-training on JFT-300)/2017 | | | | 86.9/-/- | Rethinking Atrous Convolution for Semantic Image Segmentation | https://github.com/rishizek/tensorflow-deeplab-v3 | |
| DeepLab V3/2017 | | | | 85.7/-/81.3 | Rethinking Atrous Convolution for Semantic Image Segmentation | https://github.com/rishizek/tensorflow-deeplab-v3 | Fully connected conditional random fields (CRF) |
| DeepLab V3+Xception/2018 | | | | 87.8/-/82.1 | Encoder-decoder with atrous separable convolution for semantic image segmentation | https://github.com/fyu/dilation | Xception backbone, encoder-decoder based on V3, applies depth-wise conv |
| DeepLab V3+Xception-JFT/2018 | | | | 89.0/-/- | Encoder-decoder with atrous separable convolution for semantic image segmentation | https://github.com/fyu/dilation | |
| ESPNet/2018 | 0.364M | | | 63.01/-/60.2 | ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation | https://github.com/sacmehta/ESPNet | Point-wise conv (reduces the complexity), spatial pyramid of dilated conv (provides a large receptive field), hierarchical feature fusion (HFF) |
| FC-DRN-P-D + ST/2018 | 3.9M | | | CamVid: 69.4 | On the iterative refinement of densely connected representation levels for semantic segmentation | https://github.com/ArantxaCasanova/fc-drn | Combines FC-ResNet and FC-DenseNet |
| ERFNet/2018 | ~2.1M | 24 | | -/-/69.7 | ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation | https://github.com/Eromera/erfnet | Non-bottleneck-1D (non-bt-1D) layer, combined with bottleneck designs in a way that best leverages their learning performance and efficiency |
| RefineNet/2017 | | | | 83.40/-/73.60 | RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | https://github.com/guosheng/refinenet | Residual conv unit (RCU), multi-resolution fusion and chained residual pooling; the multi-path net refines low-resolution features with concentrated low-level features in a recursive manner |
| FastFCN/2019 | | | | Pascal Context: 53.1, ADE20K: 44.34 | FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation | https://github.com/wuhuikai/FastFCN | Joint Pyramid Upsampling (JPU) |
| Fast-SCNN/2019 | 1.11M | | | -/-/68.0 | Fast-SCNN: Fast Semantic Segmentation Network | https://github.com/kshitizrimal/Fast-SCNN | MobileNetV2, learning-to-downsample module, depth-wise conv |
| efficient | | | 3.05G (input 640 × 360 × 3) | -/-/70.33 | An efficient solution for semantic segmentation: ShuffleNet V2 with atrous separable convolutions | https://github.com/sercant/mobile-segmentation | TF-Lite applied, ShuffleNetV2 as feature extractor, DeepLabV3 as encoder, (MobileNetV2) DPC |
  • An efficient solution for semantic segmentation: ShuffleNet V2 with atrous separable convolutions
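
The accuracy column above is most likely mean intersection-over-union (mIoU), the usual metric on VOC2012 and Cityscapes; the table itself does not say so, so treat that as an assumption. A minimal sketch of how mIoU can be computed from a confusion matrix (the function name `mean_iou` and the toy labels are made up for illustration):

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred, target: flat sequences of integer class ids, one per pixel.
    """
    # conf[t][p] counts pixels whose ground truth is t and prediction is p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                  # true class c, predicted as something else
        fp = sum(row[c] for row in conf) - tp   # predicted class c, true class is something else
        union = tp + fp + fn
        if union > 0:                           # skip classes absent from both pred and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a 2x3 image flattened to 6 pixels, 3 classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

Benchmark toolkits normally accumulate the confusion matrix over the whole test set and skip an "ignore" label, which this sketch leaves out.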



Updates since 2019-07-10:

  • Recent lightweight models that may be useful: MobileNetV3 (first submitted on 6 May 2019) and EfficientNet (first submitted on 28 May 2019), both using NAS (Neural Architecture Search) techniques.

  • A useful CVPR 2019 method on using knowledge distillation to improve the accuracy of lightweight semantic segmentation models without increasing parameter count or GFLOPs: Structured Knowledge Distillation for Semantic Segmentation, proposed by Microsoft Research Asia.

    Structured Knowledge Distillation for Semantic Segmentation

  • A new upsampling method called DUpsampling: the upsampling matrix W is learned, and a special feature fusion technique (inverted fusion) greatly reduces the computation. It outperforms DeepLabv3+ with only 30% of the computation. Paper: Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation (CVPR 2019)

    Decoders Matter for Semantic Segmentation-Data-Dependent Decoding Enables Flexible Feature Aggregation

  • Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

    Auto-deeplab

  • ESPNetv2: A light-weight, power efficient, and general purpose convolutional neural network

    ESPNetv2
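
To make the knowledge distillation item in the list above more concrete, here is a minimal sketch of a pixel-wise distillation term: the student is pushed to match the teacher's per-pixel class distribution via KL divergence. This covers only a pixel-wise term; it is not the paper's full structured loss, and the function names, the temperature parameter and the toy logits are assumptions for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of one pixel's class logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - lse for s in scaled]


def pixelwise_distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Average KL(teacher || student) over pixels.

    Both arguments are lists with one entry per pixel; each entry is a
    list of class logits for that pixel.
    """
    total = 0.0
    for s, t in zip(student_logits, teacher_logits):
        log_p_s = log_softmax(s, temperature)
        log_p_t = log_softmax(t, temperature)
        total += sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_p_t, log_p_s))
    return total / len(student_logits)


# Toy example: 2 pixels, 3 classes (hypothetical logits).
teacher = [[4.0, 1.0, 0.0], [0.5, 3.0, 0.2]]
student = [[2.0, 1.5, 0.0], [1.0, 1.0, 1.0]]
kd_loss = pixelwise_distillation_loss(student, teacher)
print(f"pixel-wise KD loss = {kd_loss:.4f}")
```

In training, a term like this would typically be added to the usual cross-entropy loss with a weighting factor. Only the loss changes, so the student's parameter count and GFLOPs at inference stay the same.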
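
The DUpsample item in the list above says that the upsampling matrix W is learned. A rough sketch of the forward pass only: each low-resolution feature vector is multiplied by W and the result is rearranged into a ratio × ratio block of class scores. The ordering of that block inside the projected vector, the function name `dupsample` and the fixed toy W are assumptions; in practice W would be trained.

```python
def dupsample(features, W, ratio, num_classes):
    """Linear, data-dependent upsampling (forward pass only).

    features: H x Wd x C nested lists (low-resolution feature map).
    W:        (ratio * ratio * num_classes) x C matrix as nested lists.
    Returns a (H * ratio) x (Wd * ratio) x num_classes nested list.
    """
    h, w = len(features), len(features[0])
    out = [[None] * (w * ratio) for _ in range(h * ratio)]
    for i in range(h):
        for j in range(w):
            x = features[i][j]
            # Project the C-dim feature to ratio*ratio*num_classes values.
            v = [sum(wk * xk for wk, xk in zip(row, x)) for row in W]
            # Rearrange the projected vector into a ratio x ratio block.
            for di in range(ratio):
                for dj in range(ratio):
                    start = (di * ratio + dj) * num_classes
                    out[i * ratio + di][j * ratio + dj] = v[start:start + num_classes]
    return out


# Toy example: 1x2 feature map with C=2 channels, upsampled 2x, one class.
features = [[[1, 2], [3, 4]]]
W = [[1, 0], [0, 1], [1, 1], [1, -1]]  # fixed here; learned in practice
upsampled = dupsample(features, W, ratio=2, num_classes=1)
print(upsampled)
```

Unlike bilinear upsampling, the weights here are not fixed by geometry.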

Semantic segmentation research in CVPR2019

| model | para | infer time (ms) | GFLOPs | accuracy (VOC2012/COCO/Cityscapes, %) | paper | code | more |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DFANet | 7.8M | 10 | 3.4G (input 1024 × 1024) | -/-/71.3, CamVid: 64.7 | DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation | https://github.com/Tramac/awesome-semantic-segmentation-pytorch | Proposed by Beijing Megvii Co., Ltd, deep feature aggregation |
| Auto-DeepLab | 44.42M | | | 85.6/-/82.1 | Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation | https://github.com/tensorflow/models/tree/master/research/deeplab | NAS, less computation than DeepLab, Li Fei-Fei, TensorFlow applied, oral |
| ESPNetv2 | ~6M | | | 68.0/-/66.2 | ESPNetv2: A light-weight, power efficient, and general purpose convolutional neural network | https://github.com/sacmehta/ESPNetv2 | Follows ESPNet (ECCV 2018), group conv to reduce dimension, depth-wise separable atrous conv |
| Improving | | | | -/-/83.5, CamVid: 81.7 | Improving Semantic Segmentation via Video Propagation and Label Relaxation | https://nv-adlr.github.io/publication/2018-Segmentation | Video, oral, a video prediction method to enhance segmentation |
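
Several of the lightweight models listed above rely on depth-wise separable convolutions. A small back-of-the-envelope sketch of why that shrinks the parameter count (biases ignored; the helper names and the 256-channel example are made up for illustration):

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


std = standard_conv_params(256, 256, 3)
sep = depthwise_separable_params(256, 256, 3)
print(std, sep, round(sep / std, 3))
```

Dilation (atrous convolution) spaces out the kernel taps but does not add weights, so the same counts apply to the atrous variants.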