We provide the PyTorch implementation of "The Spatially-Correlative Loss for Various Image Translation Tasks". Based on the inherent self-similarity of objects, we propose a new structure-preserving loss for one-sided unsupervised image-to-image (I2I) translation networks. The new loss considers only the spatial relationships of repeated signals, regardless of their original absolute values.
The Spatially-Correlative Loss for Various Image Translation Tasks
Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
NTU and Monash University
In CVPR 2021
- A simple example of how to use the proposed loss
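As a rough illustration of the idea, and not the repository's implementation, the sketch below compares the self-similarity maps of two feature tensors. It assumes cosine similarity over all spatial positions and an L1 difference between the maps; the function names `self_similarity` and `spatial_correlative_loss` and the random stand-in features are hypothetical.

```python
import torch
import torch.nn.functional as F


def self_similarity(feat: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between every pair of spatial positions.

    feat: (B, C, H, W) feature map -> (B, H*W, H*W) similarity map.
    """
    flat = feat.flatten(2)            # (B, C, H*W)
    flat = F.normalize(flat, dim=1)   # unit-length feature vector per position
    return flat.transpose(1, 2) @ flat


def spatial_correlative_loss(feat_src: torch.Tensor, feat_tgt: torch.Tensor) -> torch.Tensor:
    """Mean absolute difference between the two self-similarity maps."""
    return (self_similarity(feat_src) - self_similarity(feat_tgt)).abs().mean()


torch.manual_seed(0)
src = torch.randn(1, 8, 4, 4)  # stand-in for features of the input image

# Same spatial structure, different absolute values: reverse the channel
# order and rescale. Pairwise cosine similarities are unchanged.
same_structure = 5.0 * src.flip(1)

# Different spatial structure: shuffle the spatial positions.
perm = torch.randperm(16)
shuffled = src.flatten(2)[:, :, perm].reshape(1, 8, 4, 4)

loss_same = spatial_correlative_loss(src, same_structure).item()
loss_shuffled = spatial_correlative_loss(src, shuffled).item()
print(f"same structure: {loss_same:.6f}, shuffled: {loss_shuffled:.6f}")
```

In this sketch the loss stays near zero when only the feature values change and grows when the spatial layout changes, which is the behaviour the description above attributes to the proposed loss.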
This code was tested with PyTorch 1.7.0, CUDA 10.2, and Python 3.7.
- Install PyTorch 1.7.0, torchvision, and other dependencies from http://pytorch.org
- Install the Python libraries visdom and dominate for visualization
pip install visdom dominate
- Clone this repo:
git clone https://github.com/lyndonzheng/F-LSeSim
cd F-LSeSim
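After installation, a quick check along the following lines can confirm that the packages named above are importable. This is only a convenience sketch; the helper `check_environment` is hypothetical and not part of the repository.

```python
import importlib
import sys

# Packages named in the installation steps above.
REQUIRED = ["torch", "torchvision", "visdom", "dominate"]


def check_environment(packages=REQUIRED):
    """Return {package: version string, or None if it cannot be imported}."""
    report = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Some packages do not expose __version__.
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in check_environment().items():
        print(name, version if version is not None else "NOT INSTALLED")
```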
Please refer to the original CUT and CycleGAN repositories to download the datasets and to learn how to create your own.
- Train the single-modal I2I translation model:
sh ./scripts/train_sc.sh
- Set --use_norm for the cosine similarity map; the default similarity is a dot-product-based attention score. Set --learned_attn and --augment for the learned self-similarity.
- To view training results and loss plots, run
python -m visdom.server
and open the URL http://localhost:port in a browser.
- Trained models are saved under the checkpoints folder.
- More training options can be found in the options folder.
- Train the single-image translation model:
sh ./scripts/train_sinsc.sh
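The difference between the two similarity options described for --use_norm can be illustrated with a small sketch. It is an assumption about what the two options compute (raw dot products versus cosine similarity between spatial positions), not the repository's code; `similarity_map` and its `use_norm` argument are hypothetical names chosen to mirror the flag.

```python
import torch
import torch.nn.functional as F


def similarity_map(feat: torch.Tensor, use_norm: bool = False) -> torch.Tensor:
    """Pairwise similarity between spatial positions of a (B, C, H, W) tensor.

    use_norm=False: raw dot products; use_norm=True: cosine similarity.
    """
    flat = feat.flatten(2)  # (B, C, H*W)
    if use_norm:
        flat = F.normalize(flat, dim=1)
    return flat.transpose(1, 2) @ flat  # (B, H*W, H*W)


torch.manual_seed(0)
feat = torch.randn(1, 8, 4, 4)
scaled = 2.0 * feat  # same pattern, larger magnitude

# Largest change in each similarity map caused by the rescaling.
dot_gap = (similarity_map(feat) - similarity_map(scaled)).abs().max().item()
cos_gap = (similarity_map(feat, True) - similarity_map(scaled, True)).abs().max().item()
print(f"dot-product gap: {dot_gap:.4f}, cosine gap: {cos_gap:.6f}")
```

Under these assumptions the cosine map is unaffected by rescaling the features, while the dot-product map changes with their magnitude.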
As the multi-modal I2I translation model was trained on MUNIT, we do not plan to merge that code into this repository. If you wish to obtain multi-modal results, please contact us at chuanxia001@e.ntu.edu.sg.
- Test the single-modal I2I translation model:
sh ./scripts/test_sc.sh
- Test the single-image translation model:
sh ./scripts/test_sinsc.sh
- Test the FID score for all training epochs:
sh ./scripts/test_fid.sh
Download the pre-trained models (will be released soon) using the following links and put them under the checkpoints/ directory.
- Single-modal translation model: horse2zebra, semantic2image, apple2orange
- Single-image translation model: image2monet
@inproceedings{zheng2021spatiallycorrelative,
title={The Spatially-Correlative Loss for Various Image Translation Tasks},
author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
year={2021}
}
Our code is developed based on CUT and CycleGAN. We also thank pytorch-fid for FID computation, LPIPS for diversity score, and D&C for density and coverage evaluation.