Unsupervised domain adaptation is a promising technique for semantic segmentation and other computer vision tasks for which large-scale data annotation is costly and time-consuming. In semantic segmentation particularly, it is attractive to train models on annotated images from a simulated (source) domain and deploy them on real (target) domains. In this work, we present a novel framework for unsupervised domain adaptation based on the notion of target-domain consistency training. Intuitively, our work is based on the insight that in order to perform well on the target domain, a model's output should be consistent with respect to small perturbations of inputs in the target domain. Specifically, we introduce a new loss term to enforce pixelwise consistency between the model's predictions on a target image and a perturbed version of the same image. In comparison to popular adversarial adaptation methods, our approach is simpler, easier to implement, and more memory-efficient during training. Experiments and extensive ablation studies demonstrate that our simple approach achieves remarkably strong results on two challenging synthetic-to-real benchmarks, GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes.
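As a rough illustration of the consistency term described above, the sketch below computes a pixelwise cross-entropy between hard pseudolabels taken from the prediction on a target image and the prediction on a perturbed copy of it. It is written in plain Python on nested lists for readability. The function name `pixelwise_consistency_loss`, the H x W x C layout, the argmax pseudolabels, and the mean reduction are assumptions made for this example, not the repository's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_consistency_loss(logits_clean, logits_perturbed):
    """Mean cross-entropy between pseudolabels from the clean prediction
    and the class probabilities of the perturbed prediction.

    Both inputs are H x W x C nested lists of per-pixel class logits
    (an assumed layout for this sketch).
    """
    total, count = 0.0, 0
    for row_clean, row_pert in zip(logits_clean, logits_perturbed):
        for pix_clean, pix_pert in zip(row_clean, row_pert):
            # Hard pseudolabel: the most likely class on the unperturbed image.
            label = max(range(len(pix_clean)), key=pix_clean.__getitem__)
            probs = softmax(pix_pert)
            total += -math.log(probs[label])
            count += 1
    return total / count


# Toy 1x2 "image" with 3 classes.
clean = [[[2.0, 0.1, 0.1], [0.0, 3.0, 0.0]]]
consistent = [[[1.5, 0.0, 0.2], [0.1, 2.5, 0.0]]]
inconsistent = [[[0.0, 2.0, 0.0], [3.0, 0.0, 0.0]]]

loss_consistent = pixelwise_consistency_loss(clean, consistent)
loss_inconsistent = pixelwise_consistency_loss(clean, inconsistent)
```

In this toy case the loss is small when the perturbed prediction agrees with the clean one and large when it does not, which is the behavior a consistency term is meant to reward. A real training loop would compute the same quantity on batched tensors.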
- PyTorch (tested on version 1.7.1; other recent versions should also work)
- Hydra 1.1:
pip install hydra-core --pre
- Other:
pip install albumentations tqdm tensorboard
- WandB (optional):
pip install wandb
We use Hydra for configuration and Weights and Biases for logging. With Hydra, you can specify a config file (found in configs/) with --config-name=myconfig.yaml. You can also override the config from the command line by specifying the overriding arguments (without --). For example, you can disable Weights and Biases with wandb=False and you can name the run with name=myname.
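To make the override syntax concrete, here is a minimal sketch of how `key=value` arguments could be merged into a nested config. It only imitates the idea and is not Hydra's parser: the helper names `parse_value` and `apply_overrides` and the default values in `defaults` are hypothetical, and the real defaults live in the YAML files under configs/.

```python
import copy


def parse_value(text):
    """Best-effort conversion of an override value to bool, int, float, or str."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def apply_overrides(config, overrides):
    """Return a copy of `config` with `key=value` overrides applied.

    Dotted keys such as `model.checkpoint` address nested dictionaries.
    """
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return result


# Hypothetical defaults, for illustration only.
defaults = {
    "name": "debug",
    "wandb": True,
    "train": True,
    "lam_aug": 0.0,
    "model": {"checkpoint": None},
}
cfg = apply_overrides(defaults, ["wandb=False", "name=myname", "lam_aug=0.10"])
```

Here `cfg["wandb"]` becomes `False` and `cfg["name"]` becomes `"myname"`, while `defaults` is left unchanged.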
We have prepared example configs for GTA5 and SYNTHIA in configs/gta5.yaml and configs/synthia.yaml.
To run on GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes, you need to download the respective datasets. Once they are downloaded, you can either modify the config files directly, or organize/symlink the data in the datasets/ directory as follows:
datasets
├── cityscapes
│   ├── gtFine
│   │   ├── train
│   │   │   ├── aachen
│   │   │   └── ...
│   │   └── val
│   └── leftImg8bit
│       ├── train
│       └── val
├── GTA5
│   ├── images
│   ├── labels
│   └── list
├── SYNTHIA
│   └── RAND_CITYSCAPES
│       ├── Depth
│       │   └── Depth
│       ├── GT
│       │   ├── COLOR
│       │   └── LABELS
│       ├── RGB
│       └── synthia_mapped_to_cityscapes
├── city_list
├── gta5_list
└── synthia_list
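Before launching a run, it can help to confirm that the layout above is in place. The following sketch lists any expected paths that are missing under a datasets root. The helper name `missing_paths` and the choice of which paths each benchmark needs are assumptions for illustration; the repository may read more or fewer of them.

```python
from pathlib import Path
import tempfile

# Paths taken from the layout above, relative to the datasets root.
COMMON = [
    "cityscapes/gtFine/train",
    "cityscapes/gtFine/val",
    "cityscapes/leftImg8bit/train",
    "cityscapes/leftImg8bit/val",
    "city_list",
]
SOURCE = {
    "gta5": ["GTA5/images", "GTA5/labels", "GTA5/list", "gta5_list"],
    "synthia": [
        "SYNTHIA/RAND_CITYSCAPES/RGB",
        "SYNTHIA/RAND_CITYSCAPES/GT/LABELS",
        "SYNTHIA/RAND_CITYSCAPES/synthia_mapped_to_cityscapes",
        "synthia_list",
    ],
}


def missing_paths(root, source):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in COMMON + SOURCE[source] if not (root / p).exists()]


# Demonstration on a throwaway directory holding only the GTA5 layout.
with tempfile.TemporaryDirectory() as tmp:
    for rel in COMMON + SOURCE["gta5"]:
        (Path(tmp) / rel).mkdir(parents=True)
    gta5_missing = missing_paths(tmp, "gta5")
    synthia_missing = missing_paths(tmp, "synthia")
```

In practice you would call `missing_paths("datasets", "gta5")` or `missing_paths("datasets", "synthia")` from the repository root and fix whatever it reports. Symlinked directories count as present.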
- For GTA5-to-Cityscapes, we start with a model pretrained on the source (GTA5): Download
- For SYNTHIA-to-Cityscapes, we start with a model pretrained on ImageNet: Download
To run a baseline PixMatch model with standard data augmentations, we can use a command such as:
python main.py --config-name=synthia lam_aug=0.10 name=synthia_baseline
It is also easy to run a model with multiple augmentations:
python main.py --config-name=synthia lam_aug=0.00 lam_fourier=0.10 lam_cutmix=0.10 name=synthia_fourier_and_cutmix
For GTA5-to-Cityscapes, the analogous baseline uses the GTA5 config:
python main.py --config-name=gta5 lam_aug=0.10 name=gta5_baseline
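For ablations over the loss weights, the training commands can be generated programmatically. The sketch below builds argument lists in the same form as the commands above. The helper `build_command`, the swept values, and the run-name pattern are hypothetical; only `main.py`, `--config-name`, `lam_aug`, and `name` come from this README.

```python
import shlex


def build_command(config_name, run_name, **weights):
    """Assemble a main.py invocation with key=value overrides."""
    args = ["python", "main.py", f"--config-name={config_name}"]
    args += [f"{key}={value:.2f}" for key, value in weights.items()]
    args.append(f"name={run_name}")
    return args


# Illustrative sweep over the augmentation weight.
commands = [
    build_command("synthia", f"synthia_aug_{lam:.2f}", lam_aug=lam)
    for lam in (0.05, 0.10, 0.20)
]
printable = [shlex.join(cmd) for cmd in commands]
```

Each entry of `commands` can be passed to `subprocess.run(cmd, check=True)` to launch the run, and `printable` holds the equivalent shell strings for logging.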
To evaluate, simply set the train argument to False:
python main.py train=False
To evaluate a pretrained/trained model, you can run:
# GTA (default)
CUDA_VISIBLE_DEVICES=3 python main.py train=False wandb=False model.checkpoint=$(pwd)/pretrained/GTA5-to-Cityscapes-checkpoint.pth
# SYNTHIA
# Checkpoint filename assumed by analogy with the GTA5 one; adjust it to match your download.
CUDA_VISIBLE_DEVICES=3 python main.py --config-name=synthia train=False wandb=False model.checkpoint=$(pwd)/pretrained/SYNTHIA-to-Cityscapes-checkpoint.pth
@inproceedings{melaskyriazi2021pixmatch,
author = {Melas-Kyriazi, Luke and Manrai, Arjun},
title = {PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2021}
}