Announcement: BLIP is now officially integrated into CLIPTextEncode
- Fairscale>=0.4.4 (NOT in ComfyUI)
- Transformers==4.26.1 (already in ComfyUI)
- Timm>=0.4.12 (already in ComfyUI)
- Gitpython (already in ComfyUI)
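If you are unsure which of these dependencies are already present in the Python environment that runs ComfyUI, the short optional snippet below prints the installed version of each one; it is a generic check, not part of the node itself, and the names are the PyPI distribution names listed above.

```python
# Optional: check which of the node's dependencies are already installed
# in the Python environment used by ComfyUI.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("fairscale", "transformers", "timm", "GitPython"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: NOT installed")
```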
Inside ComfyUI_windows_portable\python_embeded, run:
python.exe -m pip install fairscale
Then, inside ComfyUI_windows_portable\ComfyUI\custom_nodes, run:
git clone https://github.com/paulo-coronado/comfy_clip_blip_node
In Google Colab, add a cell with the following code:
!pip install fairscale
!cd custom_nodes && git clone https://github.com/paulo-coronado/comfy_clip_blip_node
- Add the CLIPTextEncodeBLIP node;
- Connect the node with an image and select a value for min_length and max_length;
- Optional: to embed the BLIP caption in a prompt, use the keyword BLIP_TEXT (e.g. "a photo of BLIP_TEXT, medium shot, intricate details, highly detailed"); see the sketch below.
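For illustration only, the following sketch mimics what the BLIP_TEXT substitution amounts to: caption the image, then replace the keyword in the prompt. It uses the Hugging Face BLIP captioning model (Salesforce/blip-image-captioning-base, available in transformers 4.26.1) as a stand-in rather than the node's own BLIP code, and example.jpg is a placeholder file name.

```python
# Sketch (not the node's actual code): generate a BLIP caption for an image,
# then substitute it for the BLIP_TEXT keyword in a prompt template.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example.jpg").convert("RGB")  # placeholder input image
inputs = processor(images=image, return_tensors="pt")
output_ids = model.generate(**inputs, min_length=5, max_length=20)  # caption length bounds
caption = processor.decode(output_ids[0], skip_special_tokens=True)

prompt_template = "a photo of BLIP_TEXT, medium shot, intricate details, highly detailed"
prompt = prompt_template.replace("BLIP_TEXT", caption)
print(prompt)
```

In the node itself, none of this code is needed: connect an image to CLIPTextEncodeBLIP, set min_length and max_length on the node, and include BLIP_TEXT in the prompt text.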
The implementation of CLIPTextEncodeBLIP relies on resources from BLIP, ALBEF, Huggingface Transformers, and timm. We thank the original authors for open-sourcing their work.