Deep Image Matting: A Comprehensive Survey

This is the official repository of the paper Deep Image Matting: A Comprehensive Survey.

Jizhizi Li, Jing Zhang, and Dacheng Tao¹
¹ The University of Sydney, Sydney, Australia

Introduction | Preliminary | Methods | Datasets | Benchmark | Statement

Introduction

Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing. The emergence of deep learning has revolutionized the field and given rise to multiple new techniques, including automatic, interactive, and referring image matting. Here we present a comprehensive review of recent advancements in image matting in the era of deep learning, focusing on two fundamental sub-tasks: auxiliary input-based image matting and automatic image matting.

Preliminary

Image matting, the precise extraction of a soft matte for foreground objects in arbitrary images, has been studied extensively for several decades. The process can be described mathematically as

$$I_i = \alpha_i F_i + (1 - \alpha_i) B_i$$

where I represents the input image, F represents the foreground image, B represents the background image, and i indexes the pixels. The opacity of pixel i in the foreground is denoted by α_i, which ranges from 0 to 1.

We also show a typical input image, its ground truth alpha matte, and various auxiliary inputs such as a trimap, background, coarse map, user clicks, scribbles, and a text description in the following figure. The text description for this image can be "the cute smiling brown dog in the middle of the image".
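As a minimal sketch of the compositing model described above (each pixel is I = αF + (1 − α)B), the following pure-Python example blends a foreground over a background and derives a simplified trimap from a ground truth alpha matte. The function names `composite` and `naive_trimap` and the 0/128/255 label values are assumptions made for illustration; they are not taken from this repository, and practical trimap generation usually also dilates the unknown region.

```python
# Illustrative sketch only: `composite` and `naive_trimap` are hypothetical
# helpers, not part of the survey's repository.

def composite(fg, bg, alpha):
    """Blend per pixel: I_i = alpha_i * F_i + (1 - alpha_i) * B_i.

    fg, bg: lists of (r, g, b) tuples with channel values in [0, 1].
    alpha:  list of opacities in [0, 1], one per pixel.
    """
    if not (len(fg) == len(bg) == len(alpha)):
        raise ValueError("fg, bg, and alpha must have the same number of pixels")
    image = []
    for f, b, a in zip(fg, bg, alpha):
        if not 0.0 <= a <= 1.0:
            raise ValueError("alpha values must lie in [0, 1]")
        # Apply the same opacity to every color channel of the pixel.
        image.append(tuple(a * fc + (1.0 - a) * bc for fc, bc in zip(f, b)))
    return image


def naive_trimap(alpha, eps=1e-6):
    """Label each pixel as background (0), unknown (128), or foreground (255).

    Assumed convention: fully transparent pixels are background, fully
    opaque pixels are foreground, and everything in between is unknown.
    """
    labels = []
    for a in alpha:
        if a <= eps:
            labels.append(0)
        elif a >= 1.0 - eps:
            labels.append(255)
        else:
            labels.append(128)
    return labels


# Three pixels of a red foreground over a blue background.
fg = [(1.0, 0.0, 0.0)] * 3
bg = [(0.0, 0.0, 1.0)] * 3
alpha = [0.0, 0.5, 1.0]

print(composite(fg, bg, alpha))
# [(0.0, 0.0, 1.0), (0.5, 0.0, 0.5), (1.0, 0.0, 0.0)]
print(naive_trimap(alpha))
# [0, 128, 255]
```

Matting is the inverse of this blend: given only I (and optionally an auxiliary input such as the trimap), a method has to estimate α for every pixel.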

Image Matting Methods

We compile a timeline of the developments in deep learning-based image matting methods as follows.

We also summarize the image matting methods by year of publication, publication venue, input modality, automaticity, matting target, architecture, and code availability (with the link). The papers are listed in chronological order. [U] marks an unofficial implementation of the code.

| Year | Method | Pub. | Input | Auto. | Target | Arch. | Code |
|---|---|---|---|---|---|---|---|
| 2016 | Deep automatic portrait matting (DAPM) | ECCV | RGB | | human | Sequential two-step CNN | - |
| 2016 | Natural image matting using deep convolutional neural networks (DCNN) | ECCV | RGB-Coarse | | object | One-stage CNN | - |
| 2017 | Deep image matting (DIM) | CVPR | RGB-Trimap | | object | One-stage CNN+Refine | Github[U] |
| 2017 | Fast deep matting for portrait animation on mobile phone (FDM) | MM | RGB | | human | Sequential two-step CNN | - |
| 2018 | Tom-Net: Learning transparent object matting from a single image (TOM-Net) | CVPR | RGB | | transparent | Sequential two-step CNN+Refine | Github |
| 2018 | Deep propagation based image matting (DMPN) | IJCAI | RGB-Trimap | | object | One-stage CNN | - |
| 2018 | Alphagan: Generative adversarial networks for natural image matting (AlphaGAN) | BMVC | RGB-Trimap | | object | One-stage GAN | Github[U] |
| 2018 | Semantic soft segmentation (SSS) | TOG | RGB | | object | Sequential two-stage | Github |
| 2018 | Semantic human matting (SHM) | MM | RGB | | human | Sequential two-step CNN | Github[U] |
| 2018 | Active matting (ActiveMatting) | NeurIPS | RGB-Click | | object | One-stage RNN | - |
| 2019 | A late fusion cnn for digital matting (LF) | CVPR | RGB | | object | Sequential two-stage CNN | Github |
| 2019 | Learning-based sampling for natural image matting (SampleNet) | CVPR | RGB-Trimap | | object | Parallel three-stream CNN | - |
| 2019 | Indices matter: Learning to index for deep image matting (IndexNet) | ICCV | RGB-Trimap | | object | One-stage CNN | Github |
| 2019 | Disentangled image matting (AdaMatting) | ICCV | RGB-Trimap | | object | Parallel two-stream CNN+Refine | - |
| 2019 | Context-aware image matting for simultaneous foreground and alpha estimation (Context-Aware) | ICCV | RGB-Trimap | | object | Two-stream CNN | Github |
| 2020 | Natural image matting via guided contextual attention (GCA) | AAAI | RGB-Trimap | | object | One-stage CNN | Github |
| 2020 | Background matting: The world is your green screen (BM) | CVPR | RGB-Bg | | human | Parallel four-stream CNN | Github |
| 2020 | Hierarchical opacity propagation for image matting (HOP) | arXiv | RGB-Trimap | | object | Parallel two-stream CNN | Github |
| 2020 | Boosting semantic human matting with coarse annotations (SHMC) | CVPR | RGB | | human | Sequential two-stage CNN | - |
| 2020 | F, b, alpha matting (FBA) | arXiv | RGB-Trimap | | object | One-stage CNN | Github |
| 2020 | Attention-guided hierarchical structure aggregation for image matting (HAtt) | CVPR | RGB | | object | One-stage CNN | - |
| 2020 | High-resolution deep image matting (HDMatt) | AAAI | RGB-Trimap | | object | Parallel two-stream CNN | - |
| 2020 | Bridging composite and real: towards end-to-end deep image matting (GFM) | IJCV | RGB | | human, animal | Parallel two-stream CNN | Github |
| 2020 | Modnet: Real-time trimap-free portrait matting via objective decomposition (MODNet) | AAAI | RGB | | human | Parallel two-stream CNN | Github |
| 2020 | Learning affinity-aware upsampling for deep image matting (A2U) | CVPR | RGB-Trimap | | object | One-stage CNN | Github |
| 2020 | Mask guided matting via progressive refinement network (MGMatting) | CVPR | RGB-Coarse | | human | One-stage CNN | Github |
| 2020 | Improved image matting via real-time user clicks and uncertainty estimation (InteractiveMatting) | CVPR | RGB-Click | | object | Parallel two-stream CNN | - |
| 2020 | Smart scribbles for image matting (SmartScribbles) | TOMM | RGB-Scribble | | object | One-stage CNN | - |
| 2020 | Real-Time High-Resolution Background Matting (BMV2) | CVPR | RGB-Bg | | human | One-stage CNN+Refine | Github |
| 2021 | Towards enhancing fine-grained details for image matting (FDMatting) | WACV | RGB-Trimap | | object | Two-stream CNN | - |
| 2021 | Semantic image matting (SIM) | CVPR | RGB-Trimap | | object | One-stage CNN | Github |
| 2021 | Privacy-preserving portrait matting (P3M-Net) | MM | RGB | | human | Parallel two-stream CNN | Github |
| 2021 | Cascade image matting with deformable graph refinement (CasDGR) | ICCV | RGB | | object | Parallel two-stream CNN | - |
| 2021 | Deep Automatic Natural Image Matting (AIM-Net) | IJCAI | RGB | | object | Parallel two-stream CNN | Github |
| 2021 | Long-range feature propagating for natural image matting (LFPNet) | MM | RGB-Trimap | | object | Parallel two-stream CNN | Github |
| 2021 | Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction (VMFM) | ICCV | RGB | | human-object | Sequential two-stage CNN | - |
| 2021 | Tripartite Information Mining and Integration for Image Matting (TIMI-Net) | ICCV | RGB-Trimap | | object | Parallel three-stream CNN | Github |
| 2021 | Deep Image Matting with Flexible Guidance Input (FGI) | BMVC | RGB-Flexible | | object | One-stage CNN | Github |
| 2021 | Highly efficient natural image matting (HEMatting) | BMVC | RGB | | object | Sequential two-stage CNN | - |
| 2022 | Boosting Robustness of Image Matting With Context Assembling and Strong Data Augmentation (Rmat) | CVPR | RGB-Trimap | | object | Parallel two-stream CNN/Transformer | - |
| 2022 | Deep interactive image matting with feature propagation (DIIM) | TIP | RGB-Click | | object | One-stage CNN | - |
| 2022 | User-Guided Deep Human Image Matting Using Arbitrary Trimaps (UGDMatting) | TIP | RGB-Flexible | | human | Parallel two-stream CNN | - |
| 2022 | Image matting with deep gaussian process (matting-GP) | TNNLS | RGB-Trimap | | object | One-stage CNN | - |
| 2022 | Rethinking portrait matting with privacy preserving (P3M-ViTAE) | IJCV | RGB | | human | Parallel two-stream CNN/Transformer | Github |
| 2022 | Situational Perception Guided Image Matting (SPG-IM) | MM | RGB | | object | Sequential two-stage CNN | - |
| 2022 | Human instance matting via mutual guidance and multi-instance refinement (HIM) | CVPR | RGB | | human | Sequential two-stage CNN | Github |
| 2022 | MatteFormer: Transformer-Based Image Matting via Prior-Tokens (MatteFormer) | CVPR | RGB-Trimap | | object | One-stage CNN | Github |
| 2022 | Referring image matting (RIM) | CVPR | RGB-Language | | object | One-stage CNN | Github |
| 2022 | TransMatting: Enhancing Transparent Objects Matting with Transformers (TransMatting) | ECCV | RGB-Trimap | | transparent | One-stage CNN/Transformer | Github |

Image Matting Datasets

We summarize the image matting datasets, categorized into synthetic image-based benchmarks, natural image-based benchmarks, and test sets. The datasets are ordered by release date and described in terms of publication venue, naturalness, matting target, resolution, number of training and test samples, and availability (along with their links). Note that dataset size is counted as the number of distinct foregrounds, except for TOM and RefMatte, which have pre-defined composition rules.

| Name | Pub. | Natural | Target | Resolution | #Train | #Test | Publicity |
|---|---|---|---|---|---|---|---|
| DIM-481 | CVPR'17 | | object | 1298×1083 | 431 | 50 | Link |
| TOM | CVPR'18 | | transparent | - | 178,000 | 876 | Link |
| LF-257 | CVPR'19 | | human | 553×756 | 228 | 29 | Link |
| HATT-646 | CVPR'20 | | object | 1573×1731 | 596 | 60 | Link |
| PhotoMatte13k | CVPR'20 | | human | - | 13665 | - | - |
| SIM | CVPR'21 | | object | 2194×1950 | 348 | 50 | Link |
| Human-2k | ICCV'21 | | human | 2112×2075 | 2000 | 100 | Link |
| Trans-460 | ECCV'22 | | transparent | 3766×3820 | 410 | 50 | Link |
| HIM2k | CVPR'22 | | human | 1823×1424 | 1500 | 500 | Link |
| RefMatte | CVPR'23 | | object | 1543×1162 | 45000 | 2500 | Link |
| AlphaMatting | CVPR'09 | | object | 3056×2340 | 27 | 8 | Link |
| DAPM-2k | ECCV'16 | | human | 600×800 | 1700 | 300 | Link |
| SHM-35k | MM'18 | | human | - | 52511 | 1400 | - |
| SHMC-10k | CVPR'20 | | human | - | 9324 | 125 | - |
| P3M-10k | MM'21 | | human | 1349×1321 | 9421 | 1000 | Link |
| AM-2k | IJCV'22 | | animal | 1471×1195 | 1800 | 200 | Link |
| Multi-Object-1k | MM'22 | | human-object | - | 1000 | 200 | - |
| UGD-12k | TIP'22 | | human | 356×317 | 12066 | 700 | Link |
| PhotoMatte85 | CVPR'20 | | human | 2304×3456 | - | 85 | Link |
| AIM-500 | IJCAI'21 | | object | 1397×1260 | - | 500 | Link |
| RWP-636 | CVPR'21 | | human | 1038×1327 | - | 636 | Link |
| PPM-100 | AAAI'22 | | human | 2997×2875 | - | 100 | Link |

Performance Benchmarking

We provide a comprehensive evaluation of representative matting methods in the paper. Here, we present some subjective results of auxiliary input-based matting methods on alphamatting.com and of automatic matting methods on P3M-500-NP.
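For readers who want to compare predicted alpha mattes numerically, the sketch below computes two error measures commonly used in matting evaluation: the sum of absolute differences (SAD) and the mean squared error (MSE). It is an assumption that these match the exact protocol used in the paper; the helper names `sad` and `mse` and the optional `mask` parameter (for example, restricting the error to the unknown region of a trimap) are hypothetical and not part of this repository.

```python
# Illustrative sketch only: `sad` and `mse` are hypothetical helpers,
# not functions from the survey's repository.

def _select(pred, gt, mask):
    """Pair predicted and ground-truth alpha values, optionally masked."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of pixels")
    if mask is None:
        return list(zip(pred, gt))
    if len(mask) != len(pred):
        raise ValueError("mask must have the same number of pixels as pred")
    # Keep only the pixels where the mask is truthy.
    return [(p, g) for p, g, m in zip(pred, gt, mask) if m]


def sad(pred, gt, mask=None):
    """Sum of absolute differences between two alpha mattes."""
    return sum(abs(p - g) for p, g in _select(pred, gt, mask))


def mse(pred, gt, mask=None):
    """Mean squared error over the selected pixels."""
    pairs = _select(pred, gt, mask)
    if not pairs:
        raise ValueError("no pixels selected")
    return sum((p - g) ** 2 for p, g in pairs) / len(pairs)


# Flattened alpha mattes with values in [0, 1].
pred = [0.0, 0.5, 1.0, 0.75]
gt = [0.0, 1.0, 1.0, 0.25]
unknown = [False, True, False, True]  # assumed "unknown region" mask

print(sad(pred, gt))                # 1.0
print(mse(pred, gt))                # 0.125
print(mse(pred, gt, mask=unknown))  # 0.25
```

Lower values indicate a prediction closer to the ground truth alpha matte for both measures.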

Statement

If you are interested in our work, please consider citing the following:

@article{li2023deep,
  title={Deep Image Matting: A Comprehensive Survey},
  author={Jizhizi Li and Jing Zhang and Dacheng Tao},
  journal={ArXiv},
  year={2023},
  volume={abs/2304.04672}
}

This project is under the MIT license. For further questions, please contact Jizhizi Li at jili8515@uni.sydney.edu.au.

Relevant Projects

[1] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao

[2] Privacy-preserving Portrait Matting, ACM MM, 2021 | Paper | Github
     Jizhizi Li, Sihan Ma, Jing Zhang, Dacheng Tao

[3] Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022 | Paper | Github
     Jizhizi Li, Jing Zhang, Stephen J. Maybank, Dacheng Tao

[4] Referring Image Matting, CVPR, 2023 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao

[5] Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023 | Paper | Github
     Sihan Ma, Jizhizi Li, Jing Zhang, He Zhang, Dacheng Tao