Improved AnimateDiff integration for ComfyUI, initially adapted from sd-webui-animatediff but changed greatly since then. Please read the AnimateDiff repo README for more information about how it works at its core.
Examples shown here will also often make use of two helpful sets of nodes:
- ComfyUI-Advanced-ControlNet for loading files in batches and controlling which latents should be affected by the ControlNet inputs (work in progress; more advanced workflows + features for AnimateDiff usage will be added later).
- comfy_controlnet_preprocessors for ControlNet preprocessors not present in vanilla ComfyUI; this repo is archived, and the developer's future work will happen in comfyui_controlnet_aux. While most preprocessors are common between the two, some give different results. Workflows linked here use the archived version, comfy_controlnet_preprocessors.
- Clone this repo into the `custom_nodes` folder.
- Download motion modules from Google Drive | HuggingFace | CivitAI | Baidu NetDisk. You can download one or more motion models. Place the models in `ComfyUI/custom_nodes/ComfyUI-AnimateDiff/models`; they can be renamed if you want. More motion modules are being trained by the community - if I am made aware of any good ones, I will link them here as well. (TODO: create .safetensors versions of the motion modules and share them here.)
- Get creative! If it works for normal image generation, it (probably) will work for AnimateDiff generations. Latent upscales? Go for it. ControlNets, one or more stacked? You betcha. Masking the conditioning of ControlNets to only affect part of the animation? Sure. Try stuff and you will be surprised by what you can do.
Large resolutions may cause xformers to throw a CUDA error concerning a misconfigured value despite being within VRAM limitations.
It is an xformers bug accidentally triggered by the way the original AnimateDiff CrossAttention is passed in. Eventually either I will fix it, or xformers will. Until then, the workaround is to boot ComfyUI with the `--disable-xformers` argument.
Training data used by the authors of the AnimateDiff paper contained Shutterstock watermarks. Since mm_sd_v15 was finetuned on finer, less drastic movement, that motion module attempts to replicate the transparency of the watermark, and it does not get blurred away as it does with mm_sd_v14. Community finetunes of motion modules should eventually produce equivalent (or better) results without the watermark. Until then, you'll need some good RNG, or you can stick with mm_sd_v14, depending on your application.
Samples (the still images of each animation [not the workflow images] contain an embedded workflow - download one and drag it into ComfyUI to instantly load the workflow)
txt2img w/ ControlNet-stabilized latent-upscale (partial denoise on upscale, Scaled Soft ControlNet Weights)
(open_pose images provided courtesy of toyxyz)
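The "Scaled Soft ControlNet Weights" used in the sample above apply a different strength to each ControlNet block rather than one uniform weight, so the ControlNet steers composition without overpowering fine detail during the upscale. As a rough illustration of the idea (the `0.825` base multiplier, the 13-weight count, and the function name are assumptions for this sketch, not the actual values or API of the Advanced-ControlNet node):

```python
# Hypothetical sketch of "soft" per-layer ControlNet weights: strengths decay
# geometrically from the final block (full strength) back toward the earliest
# blocks. All constants here are illustrative assumptions.

def scaled_soft_weights(base_multiplier: float = 0.825, count: int = 13) -> list[float]:
    """Return per-block strengths, weakest first, 1.0 for the last block."""
    return [base_multiplier ** (count - 1 - i) for i in range(count)]

weights = scaled_soft_weights()
# The last weight is 1.0 and each earlier block is progressively weaker.
```

Lowering the base multiplier makes the guidance "softer" (earlier blocks contribute less); a multiplier of 1.0 reduces to ordinary uniform weights.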
- Nodes for saving videos, and for saving generated files into a timestamped folder instead of scattering them across the ComfyUI output directory.
- Moving-window latent implementation for generating arbitrarily-long animations instead of being capped at 24 frames (moving window will still be limited to up to 24 frames).
- Examples of using ControlNet to ease one image/ControlNet input into another, along with more nodes for Advanced-ControlNet to make doing so easier.
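To make the moving-window idea above concrete, here is a minimal sketch of how a long animation could be partitioned into overlapping windows of at most 24 latents. The window size, overlap amount, and function name are assumptions for illustration, not the repo's actual implementation or API:

```python
# Hypothetical sketch: split `total_frames` latent indices into overlapping
# windows of at most `window` frames, so each sub-batch stays within the
# motion module's 24-frame limit while still covering the whole animation.
# The overlap gives adjacent windows shared frames to keep motion coherent.

def sliding_windows(total_frames: int, window: int = 24, overlap: int = 4) -> list[list[int]]:
    """Return overlapping frame-index windows covering all frames in order."""
    if total_frames <= window:
        return [list(range(total_frames))]
    stride = window - overlap
    windows = []
    start = 0
    while start + window < total_frames:
        windows.append(list(range(start, start + window)))
        start += stride
    # Final window is flush with the end so the last frames are always covered.
    windows.append(list(range(total_frames - window, total_frames)))
    return windows

# e.g. 60 frames with a 24-frame window and 4-frame overlap yields windows
# covering frames 0-23, 20-43, and 36-59.
```

Larger overlaps trade extra computation for smoother transitions between windows, since more frames are denoised in the context of both neighbors.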