
Learning Multiple Dense Prediction Tasks from Partially Annotated Data

We propose a more realistic and general setting for multi-task dense prediction problems, called multi-task partially-supervised learning (MTPSL), where not all task labels are available in each training image (Fig. 1(b)); this generalizes the standard supervised setting (Fig. 1(a)), where all task labels are available. We also propose a novel and architecture-agnostic MTL model that enforces cross-task consistency between pairs of tasks in joint pairwise task-spaces, each encoding the commonalities between a pair of tasks, in a computationally efficient manner (Fig. 1(c)).
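As an illustration of this idea, the sketch below shows how a prediction for an unlabelled task and the ground truth of a labelled task could be mapped into a joint pairwise task-space, where their disagreement is penalized. This is a minimal PyTorch-style sketch rather than the code in this repository: the mapping architecture, joint-space width, and cosine-similarity loss are assumptions made for illustration.

import torch.nn as nn
import torch.nn.functional as F

class JointSpaceMapper(nn.Module):
    # Small convolutional head that maps a task-specific tensor (a prediction
    # or a ground-truth map) into a joint space shared by a pair of tasks.
    def __init__(self, in_channels, joint_channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, joint_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(joint_channels, joint_channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def cross_task_consistency_loss(pred_s, label_t, map_s, map_t):
    # Map the prediction of the unlabelled task s and the ground truth of the
    # labelled task t into the joint space, then penalize their disagreement
    # (here via 1 - cosine similarity, averaged over all pixels).
    z_s = map_s(pred_s)
    z_t = map_t(label_t)
    return (1.0 - F.cosine_similarity(z_s, z_t, dim=1)).mean()

In the paper, such consistency terms regularize the tasks whose labels are missing in an image, alongside the usual supervised losses for the labels that are present.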

Learning Multiple Dense Prediction Tasks from Partially Annotated Data,
Wei-Hong Li, Xialei Liu, Hakan Bilen,
CVPR 2022 (arXiv 2111.14893)

Updates

  • April'23, Code on Cityscapes is now released! The multi-task partially-supervised learning label split for the PASCAL-context dataset is also released!
  • June'22, The code on NYU-v2 is now released! The rest of the code will be available soon!
  • June'22, Our paper is listed among the CVPR'22 Best Paper Finalists.
  • March'22, Our paper is accepted to CVPR'22!

Features at a glance

  • We propose a more realistic and general setting for multi-task dense prediction problems, called multi-task partially-supervised learning (MTPSL), where not all task labels are available in each training image.

  • We propose a novel and architecture-agnostic MTL model that enforces cross-task consistency between pairs of tasks in joint pairwise task-spaces, each encoding the commonalities between a pair of tasks, in a computationally efficient manner.

  • We evaluate our method on NYU-v2, Cityscapes, and PASCAL-context under different multi-task partially-supervised learning settings, and our method obtains superior results to related baselines.

  • Applied to the standard multi-task learning setting (where all task labels are available in each training image), our method achieves state-of-the-art performance on NYU-v2 by learning cross-task consistency.

  • See our research page for more details.

Requirements

  • Python 3.6+
  • PyTorch 1.8.0 (or newer version)
  • torchvision 0.9.0 (or newer version)
  • progress
  • matplotlib
  • numpy

Prepare dataset

We use the preprocessed NYUv2 and Cityscapes datasets provided by this repo. Download the datasets and place the dataset folders in ./data/ so that the paths match the --dataroot arguments used below (e.g. ./data/nyuv2 and ./data/cityscapes2).

Usage

The easiest way to get started is to download our pre-trained models, trained with the proposed cross-task consistency learning, and evaluate them on the validation set. To download the pretrained models, one can use gdown (installed by pip install gdown) and execute the following command in the root directory of this project:

gdown https://drive.google.com/uc?id=1s9x8neT9SYR2M6C89CvbeID3XlBRJoEw && md5sum nyuv2_pretrained.zip && unzip nyuv2_pretrained.zip -d ./results/ && rm nyuv2_pretrained.zip
    

This will download the pre-trained models and place them in the ./results directory.

One can evaluate these models by:

CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_eval.py --dataroot ./data/nyuv2 --ssl-type onelabel --model ./results/nyuv2/mtl_xtc_onelabel.pth.tar

Train our method

Train our method with SegNet under multi-task partially-supervised learning settings, e.g. the one-label and random-label settings. In the one-label setting, i.e. one task label per image, we learn cross-task consistency for multi-task partially-supervised learning with:

CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_mtl_xtc.py --out ./results/nyuv2 --ssl-type onelabel --dataroot ./data/nyuv2 

One may also train our method, learning cross-task consistency for multi-task learning with full supervision (--ssl-type full):

CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_mtl_xtc.py --out ./results/nyuv2 --ssl-type full --dataroot ./data/nyuv2 

Train supervised learning baselines

  • Train the single-task learning models with SegNet:
CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_stl_sl.py --out ./results/nyuv2 --ssl-type onelabel --dataroot ./data/nyuv2 --task semantic 
  • Train the multi-task supervised learning model with SegNet (a sketch of the per-task loss masking under partial supervision follows this list):
CUDA_VISIBLE_DEVICES=<gpu_id> python nyu_mtl_sl.py --out ./results/nyuv2 --ssl-type onelabel --dataroot ./data/nyuv2
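For reference, the sketch below illustrates the per-task loss masking implied by partial supervision: each task's supervised loss is computed only over the images that carry a ground-truth label for that task. This is a minimal sketch of the setting rather than the exact code in nyu_mtl_sl.py; the dictionary-based interface and the helper name are assumptions.

def partially_supervised_loss(preds, labels, masks, task_losses):
    # preds / labels: dicts mapping task name -> batched prediction / label tensors
    # masks: dict mapping task name -> (B,) bool tensor that is True where an
    #        image has a ground-truth label for that task (e.g. exactly one
    #        True entry per image in the one-label setting)
    # task_losses: dict mapping task name -> loss function on batched tensors
    total = 0.0
    for task, loss_fn in task_losses.items():
        has_label = masks[task]
        if has_label.any():
            total = total + loss_fn(preds[task][has_label], labels[task][has_label])
    return total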

Train on Cityscapes

Similar to the experiments on NYUv2, one may train the single-task learning (STL) and multi-task learning (MTL) baselines as well as our method on Cityscapes. For example, to train our method on Cityscapes:

CUDA_VISIBLE_DEVICES=<gpu_id> python cityscapes_mtl_xtc.py --out ./results/cityscapes2 --ssl-type onelabel --dataroot ./data/cityscapes2 

Training with the provided code on Cityscapes will yield different numbers from those reported in the paper, but the rankings stay the same. To compare against the models in the paper, please re-run them with your preferred training strategy (learning rate, optimizer, etc.) and keep the training strategy consistent across all compared methods for a fair comparison.

Acknowledgements

We thank the authors of MTAN and Multi-Task-Learning-PyTorch for their source code.

Contact

For any questions, please contact Wei-Hong Li.

Citation

If you use this code, please cite our paper:

@inproceedings{li2022Learning,
    author    = {Li, Wei-Hong and Liu, Xialei and Bilen, Hakan},
    title     = {Learning Multiple Dense Prediction Tasks from Partially Annotated Data},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022}
}