AnimateDiff for ComfyUI

Improved AnimateDiff integration for ComfyUI, initially adapted from sd-webui-animatediff but changed greatly since then. Please read the AnimateDiff repo README for more information about how it works at its core.

Examples shown here will also often make use of two helpful sets of nodes (both can be cloned into custom_nodes; see the example commands after this list):

  • ComfyUI-Advanced-ControlNet for loading files in batches and controlling which latents should be affected by the ControlNet inputs (work in progress; more advanced workflows and features for AnimateDiff usage will be added later).
  • comfy_controlnet_preprocessors for ControlNet preprocessors not present in vanilla ComfyUI. That repo is archived, and the dev's future development will happen in comfyui_controlnet_aux. While most preprocessors are common between the two, some give different results; the workflows linked here use the archived version, comfy_controlnet_preprocessors.
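
Both node packs install the same way as any other ComfyUI custom node. A minimal sketch (the GitHub URLs below assume the usual locations of these repos; verify them against the links above):

```bash
# Run from your ComfyUI root directory.
cd custom_nodes

# Assumed repo URLs - double-check against the project links above.
git clone https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet.git
git clone https://github.com/Fannovel16/comfy_controlnet_preprocessors.git

# comfy_controlnet_preprocessors has its own dependency setup; see its README.
```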

How to Use

  1. Clone this repo into the custom_nodes folder (see the example commands after this list).
  2. Download motion modules from Google Drive | HuggingFace | CivitAI | Baidu NetDisk. You can download one or more motion models. Place them in ComfyUI/custom_nodes/ComfyUI-AnimateDiff/models; they can be renamed if you want. More motion modules are being trained by the community; if I am made aware of any good ones, I will link them here as well. (TODO: create .safetensor versions of the motion modules and share them here.)
  3. Get creative! If it works for normal image generation, it (probably) will work for AnimateDiff generations. Latent upscales? Go for it. ControlNets, one or more stacked? You betcha. Masking the conditioning of ControlNets to only affect part of the animation? Sure. Try stuff and you will be surprised by what you can do.
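
As a concrete sketch of steps 1 and 2, assuming a standard ComfyUI layout (the repo URL and the motion module filename below are illustrative; substitute whichever module you actually downloaded):

```bash
# Step 1: clone this repo into ComfyUI's custom_nodes folder.
cd ComfyUI/custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff.git

# Step 2: place one or more downloaded motion modules into the models folder.
# mm_sd_v14.ckpt is an example filename; renaming the files is fine.
mv ~/Downloads/mm_sd_v14.ckpt ComfyUI-AnimateDiff/models/
```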

Known Issues

Large resolutions may cause xformers to throw a CUDA error about a misconfigured value, even though the generation is within VRAM limits.

This is an xformers bug accidentally triggered by the way the original AnimateDiff CrossAttention is passed in. Eventually either I will fix it, or xformers will. When you encounter it, the workaround is to boot ComfyUI with the "--disable-xformers" argument.
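
For example, assuming ComfyUI's standard main.py entry point:

```bash
# Workaround: boot ComfyUI with xformers disabled.
python main.py --disable-xformers
```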

GIF has Watermark (especially when using mm_sd_v15)

Training data used by the authors of the AnimateDiff paper contained Shutterstock watermarks. Since mm_sd_v15 was finetuned on finer, less drastic movement, the motion module attempts to replicate the transparency of that watermark, and it does not get blurred away the way it does with mm_sd_v14. Community finetunes of motion modules should eventually produce equivalent (or better) results without the watermark. Until then, you'll need some good RNG, or you can stick with mm_sd_v14, depending on your application.

Samples (the still images of the animations [not the workflow images] contain an embedded workflow - download one and drag it into ComfyUI to instantly load the workflow)

txt2img

(workflow image: txt2image_workflow; sample outputs: AA_gif_00002_, AA_gif_00001_)

txt2img w/ latent upscale (partial denoise on upscale)

(workflow image: txt2image_upscale_partialdenoise_workflow; sample outputs: AA_upscale_gif_00007_, AA_upscale_gif_00001_)

txt2img w/ latent upscale (full denoise on upscale)

(workflow image: txt2image_upscale_workflow; sample outputs: AA_upscale_gif_00001_, AA_upscale_gif_00002_)

txt2img w/ ControlNet-stabilized latent-upscale (partial denoise on upscale, Scaled Soft ControlNet Weights)

(workflow image: txt2image_upscale_controlnetsoftweights_partialdenoise_workflow; sample outputs: AA_upscale_gif_00009_, AA_upscale_gif_00003_)

txt2img w/ ControlNet-stabilized latent-upscale (full denoise on upscale)

(workflow image: txt2image_upscale_controlnet_workflow; sample outputs: AA_upscale_controlnet_gif_00006_, AA_upscale_gif_00004_)

txt2img w/ Initial ControlNet input (using LineArt preprocessor on first txt2img as an example)

(workflow image: txt2image_controlnet_workflow; sample outputs: AA_controlnet_gif_00017_, AA_gif_00006_)

txt2img w/ Initial ControlNet input (using OpenPose images) + latent upscale w/ full denoise

(workflow image: txt2image_openpose_controlnet_upscale_workflow; open_pose images provided courtesy of toyxyz; sample outputs: AA_openpose_cn_gif_00001_, AA_gif_00029_, AA_gif_00008_)

img2img (TODO: this is outdated and still shows the old flickering version, update this)

(workflow screenshot: Screenshot 2023-07-22 at 22 08 00; sample output: AnimateDiff_00002)

Upcoming features (aka TODO):

  • Nodes for saving videos, and for saving generated files into a timestamped folder instead of all over the ComfyUI output directory.
  • Moving-window latent implementation for generating arbitrarily long animations instead of being capped at 24 frames (the moving window itself will still be limited to at most 24 frames).
  • Examples of using ControlNet to ease one image/ControlNet input into another, plus more nodes in Advanced-ControlNet to make that easier.