Project homepage || Paper || Model || Human-centric Dataset [code: asd4]
Jie Zhu, Yixiong Chen, Mingyu Ding, Ping Luo, Leye Wang†, Jingdong Wang†
Peking University, Johns Hopkins University, UC Berkeley, The University of Hong Kong, Baidu
This is the official implementation of MoLE, a human-centric text-to-image diffusion model. We provide code for both SD v1.5 and SDXL.
Please see requirements.txt. We also provide the xformers file used in our environment here.
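A minimal setup sketch, assuming you run it from the repo root; the exact xformers filename is not given in this README, so the wildcard below is an assumption:

```shell
# Install the pinned dependencies listed in requirements.txt
pip install -r requirements.txt

# Install the xformers build shipped with the repo
# (the wheel filename pattern is an assumption; use the provided file's actual name)
pip install ./xformers-*.whl
```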
Download the Human-centric Dataset [code: asd4].
This dataset consists of three subsets: human-in-the-scene images, close-up face images, and close-up hand images, totaling one million images. Moreover, these images are of high quality and have high aesthetic scores.
We also provide scripts for downloading the raw images from the corresponding websites; see the directory ./climb_scripts.
NOTE: Our dataset may be used for academic purposes only. When using it, users must ensure compliance with applicable legal regulations. See LICENSE.txt for details.
We thank the authors of xformers for providing a great library. Our code is based on sd-scripts; we thank its authors as well. We also thank Stability AI for their open-source contributions.