Official repository for the paper "CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching".
🌟 For more details, please refer to the project page: https://caraj7.github.io/comat/.
- [2024.04.05] 🚀 We release our paper on arXiv.
- Release training code in April.
We propose 💫CoMat, an end-to-end diffusion model fine-tuning strategy with an image-to-text concept matching mechanism. We leverage an image captioning model to measure image-to-text alignment and guide the diffusion model to revisit ignored tokens.
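The idea of scoring alignment with a captioning model can be sketched as follows. This is a minimal illustration, not the repository's training code: it assumes the captioning model returns a log-probability for each prompt token conditioned on the generated image, and the function names `concept_matching_loss` and `least_matched_tokens`, as well as the example probabilities, are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def concept_matching_loss(token_log_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood of the prompt tokens under the captioner.

    Assumption: token_log_probs[i] is log p(token_i | image, earlier tokens),
    as produced by some image captioning model. A lower loss means the image
    supports the prompt better.
    """
    if not token_log_probs:
        raise ValueError("token_log_probs must not be empty")
    return -sum(token_log_probs) / len(token_log_probs)


def least_matched_tokens(
    tokens: Sequence[str], token_log_probs: Sequence[float], k: int = 1
) -> List[Tuple[str, float]]:
    """Return the k tokens the captioner finds least supported by the image."""
    if len(tokens) != len(token_log_probs):
        raise ValueError("tokens and token_log_probs must have the same length")
    ranked = sorted(zip(tokens, token_log_probs), key=lambda pair: pair[1])
    return ranked[:k]


# Made-up per-token probabilities for an image that ignores the word "blue".
tokens = ["a", "red", "cube", "on", "a", "blue", "sphere"]
log_probs = [math.log(p) for p in [0.9, 0.8, 0.85, 0.9, 0.9, 0.1, 0.7]]

loss = concept_matching_loss(log_probs)
weakest = least_matched_tokens(tokens, log_probs, k=1)
```

In a real fine-tuning setup, a score of this kind would be computed on differentiable model outputs so that its gradient can flow back into the diffusion model; the plain-float version here only shows how a poorly depicted token pulls the score down.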
If you find CoMat useful for your research and applications, please cite it using this BibTeX entry:
@inproceedings{jiang2024comat,
title={CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching},
author={Dongzhi Jiang and Guanglu Song and Xiaoshi Wu and Renrui Zhang and Dazhong Shen and Zhuofan Zong and Yu Liu and Hongsheng Li},
booktitle={arXiv},
year={2024}
}