
Bridging Composite and Real: Towards End-to-end Deep Image Matting [IJCV-2021]

This is the official repository of the paper Bridging Composite and Real: Towards End-to-end Deep Image Matting.

Jizhizi Li1∗, Jing Zhang1∗, Stephen J. Maybank2, and Dacheng Tao1
1 The University of Sydney, Sydney, Australia; 2 Birkbeck College, University of London, U.K.
IJCV 2021 (arXiv 2010.16188)

Google Colab Demo | Introduction | GFM | AM-2k | BG-20k | Results Demo | Installation | Inference Code | Statement


🚀 News

The training code and the remaining pretrained models will be released very soon.

[2021-10-22]: The paper has been accepted by the International Journal of Computer Vision (IJCV)! 🎉

[2021-09-21]: The datasets AM-2k and BG-20k can now be openly accessed from the links below! Please follow the dataset release agreements to access. You can also refer to sections AM-2k and BG-20k for more details.

| Dataset | Dataset Link (Google Drive) | Dataset Release Agreement |
| :-----: | :-------------------------: | :-----------------------: |
| AM-2k   | Link                        | Agreement (MIT License)   |
| BG-20k  | Link                        | Agreement (MIT License)   |

[2020-11-17]: Created a Google Colab demo for users who want to try GFM online.

[2020-11-03]: Published the inference code and a pretrained model that can be used to test on your own animal images.

[2020-10-27]: Published a video demo (YouTube | Google Drive) covering the motivation, the network, the datasets, and test results on an animal video.

Demo on Google Colab

If you do not have a GPU in your environment or simply want a quick try online, you can use our Google Colab demo to generate results for your own images easily.

Introduction

This repository contains the code, datasets, models, test results, and a video demo for the paper Bridging Composite and Real: Towards End-to-end Deep Image Matting. We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders to learn both tasks collaboratively for end-to-end image matting. We also establish a novel Animal Matting dataset (AM-2k) for the end-to-end matting task. Furthermore, we systematically investigate the domain gap between composite and natural images, and propose a carefully designed composition route, RSSN, along with a large-scale high-resolution background dataset (BG-20k), to serve as better candidates for composition.

Here is a video demo to illustrate the motivation, the network, the datasets, and the test results on an animal video.

We have released the inference code, a pretrained model, and the Google Colab demo; see the section Inference Code for more details. We have also published the datasets AM-2k and BG-20k; please follow the guidance in sections AM-2k and BG-20k to access them. The training code and the remaining pretrained models will be released soon.

GFM

The architecture of our proposed end-to-end method GFM is illustrated below. We adopt three kinds of Representation of Semantic and Transition Area (RoSTa), -TT, -FT, and -BT, within our method.
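As a rough, self-contained illustration (not the exact preprocessing used in the paper), the -TT representation, i.e. a trimap whose unknown band marks the transition area, can be derived from a ground-truth alpha matte with morphological operations; the kernel size below is an arbitrary assumption:

```python
import cv2
import numpy as np

def alpha_to_trimap(alpha: np.ndarray, kernel_size: int = 25) -> np.ndarray:
    """Derive a 3-class trimap (-TT RoSTa) from an alpha matte.

    alpha: uint8 array in [0, 255].
    Returns a trimap with 0 = background, 128 = transition, 255 = foreground.
    """
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    definite_fg = cv2.erode(np.uint8(alpha == 255), kernel)  # shrink sure foreground
    any_fg = cv2.dilate(np.uint8(alpha > 0), kernel)         # grow anything non-background
    trimap = np.zeros_like(alpha)
    trimap[any_fg == 1] = 128        # transition band (unknown)
    trimap[definite_fg == 1] = 255   # definite foreground
    return trimap
```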

We trained GFM with three backbones, -(d) (DenseNet-121), -(r) (ResNet-34), and -(r2b) (ResNet-34 with 2 extra blocks). The trained model for each backbone can be downloaded via the links in the table below.

| GFM(d)-TT   | GFM(r)-TT   | GFM(r2b)-TT |
| :---------: | :---------: | :---------: |
| coming soon | coming soon | model       |
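To make the glance-and-focus design concrete, here is a minimal PyTorch sketch of the shared-encoder/two-decoder layout and the collaborative merge under the -TT representation. All module names are hypothetical and the encoder is a toy stand-in; the released model uses full backbones (e.g. ResNet-34) with skip connections, so treat this only as a reading aid:

```python
import torch
import torch.nn as nn

class GFMSketch(nn.Module):
    """Toy glance-and-focus matting network (TT representation).

    A shared encoder feeds two decoders: the glance decoder predicts a
    3-class segmentation (background / transition / foreground), and the
    focus decoder predicts alpha details inside the transition area.
    """
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(  # stand-in for a real backbone
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.glance_decoder = nn.Conv2d(64, 3, 3, padding=1)  # BG / transition / FG logits
        self.focus_decoder = nn.Conv2d(64, 1, 3, padding=1)   # alpha in the transition area

    def forward(self, image):
        feat = self.encoder(image)
        glance = torch.softmax(self.glance_decoder(feat), dim=1)
        focus = torch.sigmoid(self.focus_decoder(feat))
        # Collaborative merge: alpha = 1 in predicted foreground, 0 in
        # background, and the focus prediction inside the transition area.
        seg = glance.argmax(dim=1, keepdim=True)  # 0=BG, 1=transition, 2=FG
        alpha = torch.where(seg == 2, torch.ones_like(focus), torch.zeros_like(focus))
        alpha = torch.where(seg == 1, focus, alpha)
        return glance, focus, alpha
```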

AM-2k

Our proposed AM-2k contains 2,000 high-resolution natural animal images from 20 categories, along with manually labeled alpha mattes. Some examples are shown below; more can be viewed in the video demo.

AM-2k can be accessed from here. Please make sure that you have read this agreement before accessing the dataset. For more details, please refer to the readme.txt in the dataset folder.

BG-20k

Our proposed BG-20k contains 20,000 high-resolution background images without salient objects, which can be used to generate high-quality synthetic training data. Some examples are shown below; more can be viewed in the video demo.
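For reference, pasting a foreground onto a BG-20k background follows the standard compositing equation I = alpha * F + (1 - alpha) * B. The sketch below implements only this equation with placeholder paths; it is not the full RSSN composition route from the paper, which additionally accounts for discrepancies such as resolution and sharpness:

```python
import numpy as np
from PIL import Image

def composite(fg_path: str, alpha_path: str, bg_path: str) -> Image.Image:
    """Composite a foreground onto a background: I = alpha * F + (1 - alpha) * B."""
    fg_img = Image.open(fg_path).convert("RGB")
    fg = np.asarray(fg_img, dtype=np.float32)
    alpha = np.asarray(Image.open(alpha_path).convert("L"), dtype=np.float32) / 255.0
    bg = np.asarray(Image.open(bg_path).convert("RGB").resize(fg_img.size), dtype=np.float32)
    comp = alpha[..., None] * fg + (1.0 - alpha[..., None]) * bg
    return Image.fromarray(comp.astype(np.uint8))

# Example with placeholder paths:
# composite("fg.png", "alpha.png", "bg_20k/sample.jpg").save("composite.png")
```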

BG-20k can be accessed from here. Please make sure that you have read this agreement before accessing the dataset. For more details, please refer to the readme.txt in the dataset folder.

Results Demo

We test GFM on our AM-2k test set and show some results below. More results on the AM-2k test set can be found here.

Installation

Requirements:

  • Python 3.6.5+ with NumPy and scikit-image
  • PyTorch (version 1.4.0)
  • Torchvision (version 0.5.0)
  1. Clone this repository

    git clone https://github.com/JizhiziLi/GFM.git

  2. Go into the repository

    cd GFM

  3. Create a conda environment and activate it

    conda create -n gfm python=3.6.5

    conda activate gfm

  4. Install dependencies; install PyTorch and torchvision separately if needed

    pip install -r requirements.txt

    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

Our code has been tested with Python 3.6.5, PyTorch 1.4.0, Torchvision 0.5.0, and CUDA 10.1 on Ubuntu 18.04.
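A quick sanity check of the environment (our own snippet, not part of the repository) can confirm the versions and GPU visibility:

```python
import torch
import torchvision

print(torch.__version__)          # expect 1.4.0
print(torchvision.__version__)    # expect 0.5.0
print(torch.cuda.is_available())  # expect True on a CUDA 10.1 machine
```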

Inference Code - How to Test on Your Images

Here is the procedure for testing on your own images with our pretrained models:

  1. Download a pretrained model as shown in section GFM and unzip it to the folder models/

  2. Save your high-resolution sample images in folder samples/original/.

  3. Set up the parameters in scripts/deploy_samples.sh and run it

    chmod +x scripts/*

    ./scripts/deploy_samples.sh

  4. The resulting alpha mattes and transparent color images will be saved in the folders samples/result_alpha/ and samples/result_color/

We show some sample images from the internet, the predicted alpha mattes, and their transparent results below. (We adopt arch='e2e_resnet34_2b_gfm_tt' and use the hybrid testing strategy.)
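For reference, a transparent color image like those in samples/result_color/ can be reproduced from a predicted alpha matte by attaching it to the original image as an RGBA alpha channel. This standalone snippet is ours, independent of the repository scripts, and the commented paths are placeholders:

```python
from PIL import Image

def to_transparent(image_path: str, alpha_path: str, out_path: str) -> None:
    """Attach a predicted alpha matte to an image as its alpha channel."""
    image = Image.open(image_path).convert("RGB")
    alpha = Image.open(alpha_path).convert("L").resize(image.size)
    image.putalpha(alpha)
    image.save(out_path)  # save as PNG to keep the transparency

# to_transparent("samples/original/cat.jpg",
#                "samples/result_alpha/cat.png",
#                "samples/result_color/cat.png")
```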

Statement

If you are interested in our work, please consider citing the following:

@article{li2021matting,
  title={Bridging Composite and Real: Towards End-to-end Deep Image Matting},
  author={Li, Jizhizi and Zhang, Jing and Maybank, Stephen J and Tao, Dacheng},
  journal={International Journal of Computer Vision},
  publisher={Springer},
  ISSN={1573-1405},
  year={},
  pages={}
}

This project is under the MIT license. For further questions, please contact Jizhizi Li at jili8515@uni.sydney.edu.au.

Relevant Projects

[1] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao

[2] Privacy-Preserving Portrait Matting, ACM MM, 2021 | Paper | Github
     Jizhizi Li, Sihan Ma, Jing Zhang, and Dacheng Tao