Segmentation-Series-Chaos

This summary covers basic segmentation, human segmentation, and human or portrait matting for both images and video. It is a little chaotic, so I called it Segmentation-Series-Chaos. If you want a cleaner layout, feel free to fork and modify.

[done] Summary of 2019 Survey on semantic segmentation using deep learning techniques_Neurocomputing and other useful resources
[done] Semantic segmentation research in CVPR2019
[done] Matting in detail
[done] Focus on DeepLab research
[doing] Experiments
- MMNet experiment
- DeepLabv3+ matte: experiment transferring DeepLabv3+ to the matting task

Summary of 2019 Survey on semantic segmentation using deep learning techniques_Neurocomputing and other useful resources

model/year	params	inference time (ms)	FLOPs	accuracy (VOC2012/COCO/Cityscapes, %)	paper	code	notes
FCN-8s/2015	~134M	175	-	67.20/-/65.30	Fully Convolutional Networks for Semantic Segmentation	https://github.com/shelhamer/fcn.berkeleyvision.org	Start of FCNs for segmentation; accepts arbitrary input size
PSPNet/2017	65.7M	-	-	85.40/-/80.20	Pyramid Scene Parsing Network	https://github.com/hszhao/PSPNet	Multi-scale feature ensembling, pyramid pooling module
DeepLab V3-JFT (additionally pre-trained on JFT-300M)/2017				86.9/-/-	Rethinking Atrous Convolution for Semantic Image Segmentation	https://github.com/rishizek/tensorflow-deeplab-v3	~
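DeepLab V3/2017				85.7/-/81.3	Rethinking Atrous Convolution for Semantic Image Segmentation	https://github.com/rishizek/tensorflow-deeplab-v3	Atrous spatial pyramid pooling (ASPP) with image-level features; drops the fully connected conditional random field (CRF) post-processing used by earlier DeepLab versions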
DeepLab V3+ Xception/2018				87.8/-/82.1	Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation	https://github.com/fyu/dilation	Xception backbone, encoder-decoder built on V3, applies depth-wise separable conv
DeepLab V3+ Xception-JFT/2018				89.0/-/-	Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation	https://github.com/fyu/dilation	~
ESPNet/2018	0.364M			63.01/-/60.2	ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation	https://github.com/sacmehta/ESPNet	Point-wise conv (reduces complexity), spatial pyramid of dilated conv (provides a large receptive field), hierarchical feature fusion (HFF)
FC-DRN-P-D + ST/2018	3.9M			CamVid:69.4	On the iterative refinement of densely connected representation levels for semantic segmentation	https://github.com/ArantxaCasanova/fc-drn	Combine FC-ResNet and FC-DenseNet
ERFNet/2018	~2.1M	24		-/-/69.7	ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation	https://github.com/Eromera/erfnet	Non-bottleneck-1D (non-bt-1D) layer, combined with bottleneck designs in a way that best leverages their learning performance and efficiency
RefineNet/2017				83.40/-/73.60	RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation	https://github.com/guosheng/refinenet	Residual conv unit (RCU), multi-resolution fusion, and chained residual pooling; the multi-path network refines low-resolution features with fine-grained low-level features in a recursive manner
FastFCN/2019				Pascal Context: 53.1, ADE20K: 44.34	FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation	https://github.com/wuhuikai/FastFCN	Joint Pyramid Upsampling (JPU)
Fast-SCNN/2019	1.11M			-/-/68.0	Fast-SCNN: Fast Semantic Segmentation Network	https://github.com/kshitizrimal/Fast-SCNN	MobileNetV2, learning-to-downsample module, depth-wise conv
efficient			3.05G (input 640 × 360 × 3)	-/-/70.33	An efficient solution for semantic segmentation: ShuffleNet V2 with atrous separable convolutions	https://github.com/sercant/mobile-segmentation	TF-Lite applied; ShuffleNetV2 as feature extractor, DeepLabV3 as encoder; (MobileNetV2) DPC
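
Several rows above rely on atrous (dilated) convolutions to enlarge the receptive field without adding parameters. As a rough illustration, the sketch below computes the span of a dilated kernel, k_eff = k + (k - 1)(d - 1), and the receptive field of a stack of stride-1 layers. The function names are made up for this note, and the example rates 6, 12, 18 are the ASPP rates usually quoted for DeepLab V3 at output stride 16; none of this is taken from the repositories listed in the table.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span in pixels covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stride-1 conv layers given as (kernel_size, dilation) pairs."""
    field = 1
    for kernel_size, dilation in layers:
        # each stride-1 layer widens the field by (effective kernel size - 1)
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


if __name__ == "__main__":
    for rate in (1, 6, 12, 18):
        span = effective_kernel_size(3, rate)
        print(f"3x3 conv, rate {rate:2d}: spans {span} pixels, still 9 weights")
    # three stacked 3x3 layers with dilation 1, 2, 4
    print("stacked receptive field:", stacked_receptive_field([(3, 1), (3, 2), (3, 4)]))
```

The point of the comparison is that the span grows with the rate while the weight count stays at 9 per channel pair, which is why dilation is popular in the lightweight models listed here.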


On VOC and Cityscapes, DeepLab V3/V3+ is the best according to the related leaderboards: VOC2012, Cityscapes, and https://paperswithcode.com/task/semantic-segmentation
Good advice for mobile devices: stay under 2 GFLOPs (from the AI in RTC challenge group).
Google's solution: Mobile Real-time Video Segmentation
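
To make the 2 GFLOPs budget concrete, here is a back-of-the-envelope sketch that counts multiply-accumulate operations (MACs) for a standard convolution and for a depth-wise separable one. The layer size is hypothetical, and the count ignores bias, activations, and normalization. Note that papers differ on whether they report MACs or 2 x MACs as FLOPs, so treat the numbers as orders of magnitude.

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """MACs of a standard k x k convolution producing an h_out x w_out map."""
    return h_out * w_out * c_in * c_out * k * k


def depthwise_separable_macs(h_out, w_out, c_in, c_out, k):
    """MACs of a k x k depth-wise conv followed by a 1 x 1 point-wise conv."""
    depthwise = h_out * w_out * c_in * k * k
    pointwise = h_out * w_out * c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 conv, 128 -> 128 channels, on a 90 x 160 feature map
    shape = dict(h_out=90, w_out=160, c_in=128, c_out=128, k=3)
    standard = conv_macs(**shape)
    separable = depthwise_separable_macs(**shape)
    print(f"standard : {standard / 1e9:.3f} GMACs")
    print(f"separable: {separable / 1e9:.3f} GMACs")
    # the ratio equals 1 / c_out + 1 / k^2
    print(f"ratio    : {separable / standard:.3f}")
```

With these made-up sizes a single standard layer already uses the whole budget, which is one reason the mobile-oriented models in the table lean on depth-wise convolutions.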

Source: "A Survey of Progress in Video Segmentation Algorithms on Mobile Devices" (视频分割在移动端的算法进展综述), which also includes some other methods.
A great tool for implementing segmentation models easily: Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!
Multi-human parsing work from the National University of Singapore, which received the Best Student Paper Award at ACM MM 2018: Official Repository for Multi-Human-Parsing (MHP)
A similar GitHub project on human segmentation: Human-Segmentation-PyTorch
A recent project and paper from Alimama called Semantic_Human_Matting (SHM), published at ACM MM. According to the paper, SHM is the first algorithm that learns to jointly fit both semantic information and high-quality details with deep networks; its output is an alpha matte.
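
For readers new to matting: an alpha matte gives each pixel an opacity alpha in [0, 1], and the observed image is modeled as I = alpha * F + (1 - alpha) * B, with F the foreground and B the background. The sketch below applies that equation to swap in a new background. It is a plain-Python illustration with made-up pixel values, not code from SHM or any repository listed here.

```python
def composite(foreground, background, alpha):
    """Blend two RGB images (nested lists, values 0-255) with a per-pixel alpha in [0, 1]."""
    out = []
    for fg_row, bg_row, a_row in zip(foreground, background, alpha):
        row = []
        for fg_px, bg_px, a in zip(fg_row, bg_row, a_row):
            # I = alpha * F + (1 - alpha) * B, applied per channel
            row.append(tuple(round(a * f + (1.0 - a) * b) for f, b in zip(fg_px, bg_px)))
        out.append(row)
    return out


if __name__ == "__main__":
    # 1 x 3 toy image: opaque person pixel, half-transparent hair pixel, pure background
    fg = [[(200, 150, 100), (200, 150, 100), (200, 150, 100)]]
    bg = [[(0, 0, 200), (0, 0, 200), (0, 0, 200)]]
    alpha = [[1.0, 0.5, 0.0]]
    print(composite(fg, bg, alpha))
```

A hard segmentation mask is the special case where alpha is only ever 0 or 1; the fractional values around hair and edges are what matting adds.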

One of the human matting datasets: Human Matting datasets

Another useful repo for mobile devices, built with the NCNN toolkit: mobile_phone_human_matting (including datasets)

Other recent or state-of-the-art papers on matting:
- A Late Fusion CNN for Digital Matting, CVPR2019.
- Inductive Guided Filter: Real-time Deep Image Matting with Weakly Annotated Masks on Mobile Device, arXiv 2019.
- 2016_Automatic Portrait Segmentation for Image Stylization_CGF
- 2017_Deep Image Matting_CVPR
- 2017_Fast Deep Matting for Portrait Animation on Mobile Phone_ACMMM, github-pytorch
The largest and most popular collection on semantic segmentation: awesome-semantic-segmentation, which includes many useful resources, e.g. architectures, benchmarks, datasets, results of related challenges, and projects.
A blog post summarizing image semantic segmentation: Review of Deep Learning Algorithms for Image Semantic Segmentation

Updates since 2019-07-10:

Recent lightweight models that may be useful: MobileNetV3 (first submitted on 6 May 2019) and EfficientNet (first submitted on 28 May 2019), both using NAS (neural architecture search) techniques.
A useful CVPR 2019 algorithm on using knowledge distillation to improve the accuracy of lightweight semantic segmentation models without increasing the parameter count or GFLOPs: Structured Knowledge Distillation for Semantic Segmentation, proposed by Microsoft Research Asia.
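
The paper combines pixel-wise, pair-wise, and holistic distillation terms. As a rough sketch of the simplest one only, the code below averages a per-pixel KL divergence between the teacher's and the student's class distributions. It uses KL(teacher || student), the usual distillation convention; the paper's exact formulation and loss weights may differ, and the function names and toy logits are assumptions for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_kd_loss(teacher_logits, student_logits):
    """Mean KL(teacher || student) over pixels; inputs are lists of per-pixel logit lists."""
    loss = 0.0
    for t_logits, s_logits in zip(teacher_logits, student_logits):
        p = softmax(t_logits)  # teacher distribution at this pixel
        q = softmax(s_logits)  # student distribution at this pixel
        loss += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return loss / len(teacher_logits)


if __name__ == "__main__":
    teacher = [[4.0, 0.5, 0.1], [0.2, 3.0, 0.3]]  # two pixels, three classes
    student = [[2.0, 1.0, 0.5], [0.5, 1.5, 0.4]]
    print(f"pixel-wise KD loss: {pixelwise_kd_loss(teacher, student):.4f}")
    print(f"identical outputs : {pixelwise_kd_loss(teacher, teacher):.4f}")
```

In training this term would be added to the usual cross-entropy loss on the student; the student's size and FLOPs stay unchanged because the teacher is only used during training.
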
A new upsampling method called DUpsampling: the upsampling matrix W is learned, and a special feature fusion technique (a kind of inverted fusion) greatly reduces computation. It outperforms DeepLabv3+ with only 30% of the computation. Paper: Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation (CVPR 2019)
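
Roughly, DUpsampling replaces bilinear upsampling with a linear projection: each low-resolution feature vector x is mapped by W to r * r * C values, which are then rearranged into an r x r block of C-channel outputs. The sketch below shows only that projection and rearrangement in plain Python with a hand-written W. In the paper W is learned; the function name, argument names, and toy shapes here are illustrative assumptions.

```python
def dupsample(features, weight, ratio, num_classes):
    """Upsample an H x W grid of feature vectors by `ratio` using a linear map.

    features: H x W x C_in nested lists
    weight:   (ratio * ratio * num_classes) x C_in matrix (learned in practice)
    returns:  (H * ratio) x (W * ratio) x num_classes nested lists
    """
    height, width = len(features), len(features[0])
    out = [[None] * (width * ratio) for _ in range(height * ratio)]
    for i in range(height):
        for j in range(width):
            x = features[i][j]
            # v = W x, a flat vector of length ratio * ratio * num_classes
            v = [sum(w * xi for w, xi in zip(row, x)) for row in weight]
            # rearrange v into a ratio x ratio block of class-score vectors
            for di in range(ratio):
                for dj in range(ratio):
                    start = (di * ratio + dj) * num_classes
                    out[i * ratio + di][j * ratio + dj] = v[start:start + num_classes]
    return out


if __name__ == "__main__":
    ratio, num_classes = 2, 1
    # Toy W: scales the first input feature differently for each position in the block
    weight = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
    features = [[[1.0, 9.0], [10.0, 9.0]]]  # 1 x 2 grid, C_in = 2
    for row in dupsample(features, weight, ratio, num_classes):
        print(row)
```

Because the projection is applied independently at every low-resolution location, it can be run as a 1 x 1 convolution followed by a reshape, which keeps the decoder cheap.
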
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network

Semantic segmentation research in CVPR2019

model	params	inference time (ms)	GFLOPs	accuracy (VOC2012/COCO/Cityscapes, %)	paper	code	notes
DFANet	7.8M	10	3.4G (input 1024 × 1024)	-/-/71.3 CamVid: 64.7	DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation	https://github.com/Tramac/awesome-semantic-segmentation-pytorch	Proposed by Beijing Megvii Co., Ltd; deep feature aggregation
Auto-DeepLab	44.42M			85.6/-/82.1	Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation	https://github.com/tensorflow/models/tree/master/research/deeplab	NAS; less computation than DeepLab; Fei-Fei Li is a co-author; implemented in TensorFlow; oral
ESPNetv2	~6M			68.0/-/66.2	ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network	https://github.com/sacmehta/ESPNetv2	Builds on ESPNet (ECCV 2018); group conv to reduce dimensions; depth-wise separable atrous conv
Improving				-/-/83.5 CamVid: 81.7	Improving Semantic Segmentation via Video Propagation and Label Relaxation	https://nv-adlr.github.io/publication/2018-Segmentation	Video; oral; a video prediction method used to improve segmentation
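
The accuracy columns in both tables are benchmark scores; the VOC2012 and Cityscapes leaderboards rank methods by mean intersection-over-union (mIoU). For reference, a minimal sketch of mIoU on flattened label lists follows. The function name and toy labels are made up, and real benchmark scripts accumulate a confusion matrix over the whole dataset, which this sketch does not attempt.

```python
def mean_iou(prediction, target, num_classes, ignore_label=255):
    """Mean IoU over the classes that appear in the prediction or the target."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred, true in zip(prediction, target):
        if true == ignore_label:
            continue  # unlabeled pixels do not count
        if pred == true:
            intersection[true] += 1
            union[true] += 1
        else:
            union[pred] += 1  # false positive for the predicted class
            union[true] += 1  # false negative for the true class
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)


if __name__ == "__main__":
    target = [0, 0, 1, 1, 2, 255]
    prediction = [0, 1, 1, 1, 2, 2]
    print(f"mIoU = {mean_iou(prediction, target, num_classes=3):.3f}")
```

Per-class IoU is TP / (TP + FP + FN), so a class that is rare but badly segmented pulls the mean down as much as a common one.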

HymEric/Segmentation-Series-Chaos
