ComfyUI implementation for ELLA.
- [2024.4.22] Fixed unstable image quality with multi-batch generation. Added CLIP concat (LoRA trigger words are now supported).
- [2024.4.19] Documenting nodes
- [2024.4.19] Initial repo
The examples directory contains workflow examples. You can load these images directly into ComfyUI as workflows.
🎉 It works with ControlNet!
🎉 It works with LoRA trigger words by concatenating the CLIP conditioning!
EMMA support is a work in progress.
Download or git clone this repository into the ComfyUI/custom_nodes/ directory.
ComfyUI-ELLA requires the latest version of ComfyUI. If something doesn't work, be sure to upgrade.
cd ComfyUI/custom_nodes
git clone https://github.com/TencentQQGYLab/ComfyUI-ELLA
Next, install the dependencies.
cd ComfyUI-ELLA
pip install -r requirements.txt
These models must be placed in the corresponding directories under models.
You can also use any custom location by setting an ella and ella_encoder entry in the extra_model_paths.yaml file.
- ComfyUI/models/ella (create it if not present): place the ELLA models here.
- ComfyUI/models/ella_encoder (create it if not present): place the FLAN-T5 XL text encoder here. It should be a folder in the transformers layout, containing config.json.
In summary, you should have the following model directory structure:
ComfyUI/models/ella/
└── ella-sd1.5-tsc-t5xl.safetensors
ComfyUI/models/ella_encoder/
└── models--google--flan-t5-xl--text_encoder
    ├── config.json
    ├── model.safetensors
    ├── special_tokens_map.json
    ├── spiece.model
    ├── tokenizer_config.json
    └── tokenizer.json
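
To confirm the layout before starting ComfyUI, a small check script can help. The following is a minimal sketch and not part of this repository: the function name missing_model_files and the EXPECTED_FILES list are hypothetical, and the script assumes the default model locations shown above rather than a custom location from extra_model_paths.yaml.

```python
from pathlib import Path

# Files from the directory tree above, relative to ComfyUI/models.
# Assumption: default locations, not a custom path from extra_model_paths.yaml.
ENCODER_DIR = "ella_encoder/models--google--flan-t5-xl--text_encoder"
EXPECTED_FILES = [
    "ella/ella-sd1.5-tsc-t5xl.safetensors",
    ENCODER_DIR + "/config.json",
    ENCODER_DIR + "/model.safetensors",
    ENCODER_DIR + "/special_tokens_map.json",
    ENCODER_DIR + "/spiece.model",
    ENCODER_DIR + "/tokenizer_config.json",
    ENCODER_DIR + "/tokenizer.json",
]


def missing_model_files(models_dir):
    """Return the expected files that are absent under models_dir."""
    root = Path(models_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains ComfyUI/.
    missing = missing_model_files("ComfyUI/models")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected ELLA model files are present.")
```

The check only tests that each file exists. It does not validate file contents or verify that the weights load.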
- XXX not implemented for 'Half'. See issue #12
- Support prompt weighting
- ComfyUI: https://github.com/comfyanonymous/ComfyUI
- Diffusers (borrowed timestep modules): https://github.com/huggingface/diffusers
@misc{hu2024ella,
title={ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment},
author={Xiwei Hu and Rui Wang and Yixiao Fang and Bin Fu and Pei Cheng and Gang Yu},
year={2024},
eprint={2403.05135},
archivePrefix={arXiv},
primaryClass={cs.CV}
}