
Code for "Domain Adaptation for Semantic Segmentation with Maximum Squares Loss" in PyTorch.


Domain Adaptation for Semantic Segmentation with Maximum Squares Loss

By Minghao Chen, Hongyang Xue, Deng Cai.


A PyTorch implementation of our ICCV 2019 paper "Domain Adaptation for Semantic Segmentation with Maximum Squares Loss". The segmentation model is based on Deeplabv2 with a ResNet-101 backbone. "MaxSquare+IW+Multi", introduced in the paper, achieves competitive results on three UDA datasets: GTA5, SYNTHIA, and CrossCity. Moreover, our method achieves state-of-the-art results in GTA5-to-Cityscapes and Cityscapes-to-CrossCity adaptation.
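For readers new to the method, here is a minimal, framework-free sketch of the maximum squares loss as defined in the paper: the negative mean over pixels of half the sum of squared class probabilities. The function names and the list-of-lists input layout are illustrative assumptions; the code in this repository operates on PyTorch tensors instead.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_squares_loss(pixel_logits):
    """Return -1/(2N) * sum_n sum_c p[n][c]**2 over N pixels.

    pixel_logits: one list of class scores per pixel (hypothetical layout).
    """
    n = len(pixel_logits)
    total = 0.0
    for logits in pixel_logits:
        probs = softmax(logits)
        total += sum(p * p for p in probs)
    return -total / (2 * n)


if __name__ == "__main__":
    print(max_squares_loss([[0.0, 0.0]]))     # uniform two-class prediction
    print(max_squares_loss([[10.0, -10.0]]))  # near one-hot prediction
```

A uniform two-class prediction gives -0.25, while a near one-hot prediction approaches -0.5, so minimising the loss pushes target predictions towards higher confidence.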


If you use this code in your research, please cite:

author = {Chen, Minghao and Xue, Hongyang and Cai, Deng},
title = {Domain Adaptation for Semantic Segmentation With Maximum Squares Loss},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}


The code is implemented with Python 3.6 and PyTorch 1.0.0.

Install the newest PyTorch from https://pytorch.org/.

To install the required Python packages, run

pip install -r requirements.txt



  • Download the GTA5 dataset, which contains 24,966 annotated images with 1914×1052 resolution taken from the GTA5 game. We use the sample code for reading the label maps and a split into training/validation/test sets from here. In the experiments, we resize GTA5 images to 1280x720.
  • Download Cityscapes, which contains 5,000 annotated images with 2048×1024 resolution taken from real urban street scenes. We resize Cityscapes images to 1024x512 (or 1280x640, which yields slightly better results but costs more time).
  • Download the checkpoint pretrained on GTA5.
  • If you want to pretrain the model yourself, download the model pretrained on ImageNet.



  • Download the NTHU dataset, which consists of images with 2048×1024 resolution from four different cities: Rio, Rome, Tokyo, and Taipei. We resize images to 1024x512, the same as Cityscapes.
  • Download the checkpoint pretrained on Cityscapes.

Put all datasets into the "datasets" folder and all checkpoints into the "pretrained_model" folder.
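If the two folders do not exist yet, a small helper along these lines can create them. This is only a convenience sketch: the function name `ensure_layout` is hypothetical, and the sub-folder names inside "datasets" are not specified here, so none are created.

```python
from pathlib import Path


def ensure_layout(root="."):
    """Create the "datasets" and "pretrained_model" folders under root."""
    root = Path(root)
    created = []
    for name in ("datasets", "pretrained_model"):
        folder = root / name
        # exist_ok=True makes the call safe to repeat.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    for folder in ensure_layout("."):
        print("ready:", folder)
```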


We present several transferred results reported in our paper and provide the corresponding checkpoints.



Method    Source  MinEnt  MaxSquare  MaxSquare+IW  MaxSquare+IW+Multi
mIoU(%)   36.9    42.2    44.3       45.2          46.4


Method    Source  MaxSquare  MaxSquare+IW
mIoU(%)   51.0    53.9       54.5


Method    Source  MaxSquare  MaxSquare+IW
mIoU(%)   48.9    52.0       53.3


Method    Source  MaxSquare  MaxSquare+IW
mIoU(%)   47.8    49.7       50.5


Method    Source  MaxSquare  MaxSquare+IW
mIoU(%)   46.3    49.8       50.6
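The tables above report mean intersection-over-union (mIoU). As a reminder of how that metric is usually computed, here is a minimal sketch from a confusion matrix. Skipping classes that never occur in either the ground truth or the predictions is an assumption of this sketch; the evaluation script in this repository may treat such classes differently.

```python
def mean_iou(conf):
    """Mean IoU from a square confusion matrix.

    conf[i][j] is the number of pixels with ground-truth class i
    that were predicted as class j.
    """
    num_classes = len(conf)
    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                                # missed pixels of class c
        fp = sum(conf[r][c] for r in range(num_classes)) - tp  # wrongly predicted as c
        denom = tp + fp + fn
        if denom > 0:  # assumption: ignore classes absent everywhere
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    print(mean_iou([[3, 1], [1, 3]]))
```

In the example, each class has 3 true positives, 1 false positive, and 1 false negative, so both IoUs are 3/5.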



(Optional) Pretrain the model on the source domain (GTA5).

Otherwise, download the checkpoint pretrained on GTA5 in the "Setup" section.

python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720"

Then, in the next step, set --pretrained_ckpt_file "./log/gta5_pretrain/gta5final.pth".
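The pretraining command above passes both --iter_max 200000 and --iter_stop 80000. One plausible reading, assuming the "poly" learning-rate decay commonly used with Deeplabv2 (power 0.9), is that --iter_max sets the decay horizon while training halts early at --iter_stop. This is an assumption about the training script, not something stated in this README; the sketch below only illustrates the schedule under that assumption.

```python
def poly_lr(base_lr, iteration, iter_max, power=0.9):
    """Poly decay: base_lr * (1 - iteration / iter_max) ** power.

    power=0.9 is the value commonly used with Deeplab-style training
    (assumed here, not confirmed for this repository).
    """
    return base_lr * (1.0 - iteration / iter_max) ** power


if __name__ == "__main__":
    for it in (0, 40000, 80000):
        print(it, poly_lr(2.5e-4, it, iter_max=200000))
```

Under this reading, the learning rate at iteration 80000 is still roughly 63% of its initial value rather than having decayed to zero.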

  • MaxSquare
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1
  • MaxSquare+IW
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.1_IW_maxsquare_round=5/" --pretrained_ckpt_file "./pretrained_model/GTA5_source.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --lambda_target 0.1 --IW_ratio 0.2
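To give some intuition for --IW_ratio, here is a sketch of the image-wise weighting factor as we read it from the paper: each pixel's loss term is scaled by 1 / (N_c^α · N^(1−α)), where N_c is the number of pixels in the image predicted as class c, N is the total number of pixels, and α is the ratio (0.2 in the command above). The function name and the flat label list are illustrative assumptions, not the repository's implementation.

```python
from collections import Counter


def image_wise_weights(pred_labels, alpha=0.2):
    """Per-pixel weights 1 / (N_c**alpha * N**(1 - alpha)) for one image.

    pred_labels: flat list with the predicted class index of each pixel
    (hypothetical layout). Rarely predicted classes get larger weights.
    """
    n = len(pred_labels)
    counts = Counter(pred_labels)
    return [1.0 / (counts[c] ** alpha * n ** (1.0 - alpha)) for c in pred_labels]


if __name__ == "__main__":
    # Three pixels predicted as class 0, one as class 1.
    print(image_wise_weights([0, 0, 0, 1], alpha=0.2))
```

With α = 0 every pixel receives the plain 1/N weight, and with α = 1 each class is fully normalised by its own predicted pixel count, so α interpolates between the two.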

Pretrain the multi-level model on the source domain (GTA5) by adding "--multi True".

python3 tools/train_source.py --gpu "0" --dataset 'gta5' --checkpoint_dir "./log/gta5_pretrain_multi/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1280,720" --multi True
  • MaxSquare+IW+Multi
python3 tools/solve_gta5.py --gpu "0" --backbone "deeplabv2_multi" --dataset 'cityscapes' --checkpoint_dir "./log/gta2city_AdaptSegNet_ST=0.09_IW_maxsquare_multi_round=5/" --pretrained_ckpt_file "./log/gta5_pretrain_multi/gta5best.pth" --round_num 5 --target_mode "IW_maxsquare" --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --target_crop_size "1280,640" --lambda_target 0.09 --IW_ratio 0.2 --multi True --lambda_seg 0.1 --threshold 0.95


python3 tools/evaluate.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/eval_city" --pretrained_ckpt_file "./log/gta2city_AdaptSegNet_ST=0.1_maxsquare_round=5/gta52city_maxsquarebest.pth" --image_summary True --flip True

To have a look at predicted examples, run tensorboard as follows:

tensorboard --logdir=./log/eval_city  --port=6009
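The evaluation command above passes --flip True. A common meaning of such a flag is flip test-time augmentation: the prediction for an image is averaged with the mirrored-back prediction for its horizontally flipped copy. Whether evaluate.py does exactly this is an assumption; the sketch below shows the idea on plain nested lists, with `predict` standing in for a hypothetical model call that returns one score per pixel.

```python
def flip_horizontal(grid):
    """Mirror a 2D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]


def predict_with_flip(predict, image):
    """Average predict(image) with the mirrored-back prediction of the flipped image."""
    direct = predict(image)
    mirrored = flip_horizontal(predict(flip_horizontal(image)))
    return [
        [(a + b) / 2.0 for a, b in zip(row_d, row_m)]
        for row_d, row_m in zip(direct, mirrored)
    ]


if __name__ == "__main__":
    # Toy "model" with a position bias: adds the column index to each pixel.
    def biased(img):
        return [[v + col for col, v in enumerate(row)] for row in img]

    print(predict_with_flip(biased, [[1, 2, 3]]))
```

In the toy example, the left-to-right bias of the stand-in model is partly cancelled by the averaging, which is the usual motivation for this kind of augmentation.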


(Optional) Pretrain the model on the source domain (Cityscapes).

python3 tools/train_source.py --gpu "0" --dataset 'cityscapes' --checkpoint_dir "./log/cityscapes_pretrain_class13/" --iter_max 200000 --iter_stop 80000 --freeze_bn False --weight_decay 5e-4 --lr 2.5e-4 --crop_size "1024,512" --num_classes 13
  • MaxSquare (taking "Rome" as an example)
python3 tools/solve_crosscity.py --gpu "0" --city_name 'Rome' --source_dataset 'cityscapes' --checkpoint_dir "./log/city2Rome_maxsquare/" --pretrained_ckpt_file "./pretrained_model/Cityscapes_source_class13.pth"  --crop_size "1024,512" --target_crop_size "1024,512"  --epoch_num 10 --target_mode "maxsquare" --lr 2.5e-4 --lambda_target 0.1 --num_classes 13


The structure of this code is largely based on this repo.

The Deeplabv2 model is borrowed from Pytorch-Deeplab.