This list contains only resources (e.g., papers and blogs) that I have read and found interesting, so how often it is updated depends on how fast I read.


Catalogue:


1. Vision: [Back to Top]

1.1. Text-to-Image Generation:

  • "AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities" Zhongzhi Chen, Guang Liu, Bo-Wen Zhang, Fulong Ye, Qinghong Yang, Ledell Wu; [arxiv][code]
  • "Hierarchical Text-Conditional Image Generation with CLIP Latents" Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen; [arxiv]
  • "High-Resolution Image Synthesis with Latent Diffusion Models" Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer; [arxiv]

1.2. Object Detection:

  • "DiffusionDet: Diffusion Model for Object Detection" Shoufa Chen, Peize Sun, Yibing Song, Ping Luo; [arxiv][code]

1.3. Image Generation:

  • "Denoising Diffusion Implicit Models" Jiaming Song, Chenlin Meng, Stefano Ermon; [arxiv][code]
  • "Adding Conditional Control to Text-to-Image Diffusion Models" Lvmin Zhang, Maneesh Agrawala; [arxiv][code]


2. Language: [Back to Top]

2.1. Text Generation:

  • "Diffusion-LM Improves Controllable Text Generation" Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto; [arxiv][code]
  • "DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models" Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu; [arxiv][code]
  • "GENIE: Large Scale Pre-training for Generation with Diffusion Model" Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Weizhu Chen, Nan Duan; [arxiv]
  • "Difformer: Empowering Diffusion Model on Embedding Space for Text Generation" Zhujin Gao, Junliang Guo, Xu Tan, Yongxin Zhu, Fang Zhang, Jiang Bian, Linli Xu; [arxiv]
  • "DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models" Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong; [arxiv][code]
  • "Latent Diffusion for Language Generation" Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Weinberger; [arxiv]

3. Vision and Language: [Back to Top]

3.1. Image Captioning:

  • "Exploring Discrete Diffusion Models for Image Captioning" Zixin Zhu, Yixuan Wei, Jianfeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu; [arxiv][code]

4. Other Topics: [Back to Top]

  • "CARD: Classification and Regression Diffusion Models" Xizewen Han, Huangjie Zheng, Mingyuan Zhou; [arxiv][code]

5. Blogs and Other Resources: [Back to Top]

  • "The Illustrated Stable Diffusion" Jay Alammar; [link]
  • "How diffusion models work: the math from scratch" Sergios Karagiannakos, Nikolas Adaloglou; [link]
  • GitHub repo: minimal-text-diffusion