
Universal Representations: A Unified Look at Multiple Task and Domain Learning

Primary language: Python. License: MIT.

Universal Representations

We propose a Universal Representation Learning framework, shown in (a), that generalizes over multi-task dense prediction (b), multi-domain many-shot learning (c), and cross-domain few-shot learning (d). It distills the knowledge of multiple task- or domain-specific networks into a single deep neural network after aligning that network's representations with the task- or domain-specific ones through small-capacity adapters.

Figure 1. Universal Representation Learning.

Universal Representations: A Unified Look at Multiple Task and Domain Learning,
Wei-Hong Li, Xialei Liu, Hakan Bilen,
IJCV 2023 (arXiv 2204.02744)

Universal Representation Learning from Multiple Domains for Few-shot Classification,
Wei-Hong Li, Xialei Liu, Hakan Bilen,
ICCV 2021 (arXiv 2103.13841)

Knowledge distillation for multi-task learning,
Wei-Hong Li, Hakan Bilen,
ECCV Workshop 2020 (arXiv 2007.06889)

Features at a glance

  • We propose a unified look at jointly learning multiple vision tasks and visual domains through universal representations learned by a single deep neural network.

  • We propose distilling the knowledge of multiple task- or domain-specific networks into a single deep neural network after aligning its representations with the task- or domain-specific ones through small-capacity adapters.

  • We show that universal representations achieve state-of-the-art performance on multiple dense prediction problems on NYU-v2 and Cityscapes, on multiple image classification problems from diverse domains in the Visual Decathlon dataset, and on cross-domain few-shot learning in Meta-Dataset.
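The distillation idea in the list above can be illustrated with a small NumPy sketch. It is not the repository's training code. It assumes each adapter is a plain linear map and that alignment is measured as a squared distance between L2-normalized features; the papers define the losses actually used. The names `alignment_loss`, `distillation_loss`, `adapters`, and `teachers` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def alignment_loss(universal_feats, teacher_feats, adapter):
    """Mean squared distance between adapted shared features and teacher features.

    universal_feats: (batch, d_u) features from the single shared network.
    teacher_feats:   (batch, d_t) features from one frozen task/domain-specific network.
    adapter:         (d_u, d_t) linear map, one per task/domain (assumed form).
    """
    adapted = universal_feats @ adapter
    diff = l2_normalize(adapted) - l2_normalize(teacher_feats)
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def distillation_loss(universal_feats, teachers, adapters):
    """Average the alignment loss over all task/domain-specific teachers."""
    losses = [alignment_loss(universal_feats, t, a) for t, a in zip(teachers, adapters)]
    return sum(losses) / len(losses)


rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))                          # shared features, batch of 4
adapters = [rng.normal(size=(8, 8)) for _ in range(3)]   # one adapter per teacher
# Toy teachers constructed so that each adapter aligns the shared features exactly.
teachers = [feats @ a for a in adapters]
print(distillation_loss(feats, teachers, adapters))      # ~0, aligned by construction
```

In this toy setup the adapters absorb the difference between the shared feature space and each teacher's feature space, so the shared features themselves do not have to match any single teacher.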

Main Results

Multi-task Learning on NYU-v2 with SegNet

Table 1. Testing Results on NYU-v2.

Multi-domain Learning on Visual Decathlon with ResNet-26

Table 2. Testing Results on Visual Decathlon.

Cross-domain Few-shot Learning on Meta-Dataset with ResNet-18

Table 3. Testing Results on Meta-Dataset.

Usage

Multi-task Learning on NYU-v2

We evaluate our method on the NYU-v2 dataset, learning universal representations that jointly perform multiple dense prediction tasks (semantic segmentation, depth estimation, and surface normal estimation) within a single network, and compare it with existing multi-task optimization methods.

Multi-domain Learning on Visual Decathlon

We evaluate our method on the Visual Decathlon dataset, learning universal representations over diverse visual domains (10 datasets, including ImageNet and UCF101) within a single network, and compare it with existing multi-domain learning methods.

Cross-domain Few-shot Learning on Meta-Dataset

We evaluate our method on Meta-Dataset, learning universal representations from multiple diverse visual domains (8 datasets, including ImageNet, Birds, and Quick Draw) within a single network for cross-domain few-shot learning, and compare it with existing state-of-the-art methods.
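To make the few-shot setting concrete, here is a minimal sketch of how a fixed feature extractor can be used in a single episode: class centroids are computed from a few labeled support examples, and each query is assigned to the most similar centroid. The nearest-centroid classifier with cosine similarity is an assumption chosen for illustration, not a description of the repository's evaluation code, and `nearest_centroid_predict` is a hypothetical name. The 2-D vectors stand in for features produced by the network.

```python
import numpy as np


def nearest_centroid_predict(support_feats, support_labels, query_feats):
    """Classify queries by cosine similarity to per-class mean support features."""
    classes = np.unique(support_labels)
    centroids = np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in classes]
    )
    # Cosine similarity between every query and every class centroid.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return classes[np.argmax(q @ c.T, axis=1)]


# Toy 2-way episode: three support examples per class, two queries.
support = np.array([[1.0, 0.1], [0.9, 0.0], [1.1, 0.2],
                    [0.0, 1.0], [0.1, 0.9], [0.2, 1.1]])
labels = np.array([0, 0, 0, 1, 1, 1])
queries = np.array([[0.8, 0.1], [0.1, 0.7]])
print(nearest_centroid_predict(support, labels, queries))  # prints [0 1]
```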

Model Zoo

Multi-task Learning on NYU-v2 with SegNet

STL Models | URL Model

Multi-domain Learning on Visual Decathlon with ResNet-26

SDL Models | SDL Models (Train+Val) | URL Model (Train+Val) | URL (Parallel Adapter) Model (Train+Val)

Cross-domain Few-shot Learning on Meta-Dataset with ResNet-18

SDL Models | URL Model

Contact

For any questions, please contact Wei-Hong Li.

Citation

If you use this code, please cite our papers:

@article{li2023Universal,
    author    = {Li, Wei-Hong and Liu, Xialei and Bilen, Hakan},
    title     = {Universal Representations: A Unified Look at Multiple Task and Domain Learning},
    journal   = {International Journal of Computer Vision},
    pages     = {1--25},
    year      = {2023},
    publisher = {Springer}
}

@inproceedings{li2021Universal,
    author    = {Li, Wei-Hong and Liu, Xialei and Bilen, Hakan},
    title     = {Universal Representation Learning From Multiple Domains for Few-Shot Classification},
    booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {9526--9535}
}

@inproceedings{li2020knowledge,
    author    = {Li, Wei-Hong and Bilen, Hakan},
    title     = {Knowledge distillation for multi-task learning},
    booktitle = {European Conference on Computer Vision (ECCV) Workshop},
    year      = {2020}
}