AnimateDiff for ComfyUI

AnimateDiff integration for ComfyUI, adapted from sd-webui-animatediff. Please read the original repo's README for more information.

How to Use

  1. Clone this repo into the custom_nodes folder.
  2. Download motion modules and put them under comfyui-animatediff/models/.
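
To confirm the files ended up where the steps above put them, a small check like the following can help. This is only a sketch: the function name `find_motion_modules` is hypothetical, and the accepted file extensions are an assumption (the README only names a `.ckpt` module).

```python
from pathlib import Path

# Assumed extensions for motion module files; the README only names a .ckpt file.
MODULE_SUFFIXES = {".ckpt", ".safetensors"}


def find_motion_modules(comfyui_root):
    """Return motion module file names found under
    <comfyui_root>/custom_nodes/comfyui-animatediff/models/."""
    models_dir = Path(comfyui_root) / "custom_nodes" / "comfyui-animatediff" / "models"
    if not models_dir.is_dir():
        # Repo not cloned into custom_nodes, or models folder missing.
        return []
    return sorted(
        p.name
        for p in models_dir.iterdir()
        if p.is_file() and p.suffix.lower() in MODULE_SUFFIXES
    )
```

An empty result suggests either the repo is not under custom_nodes or no motion module has been downloaded yet.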

Update 2023/09/25

Motion LoRA is now supported!

Download motion LoRAs and put them in the comfyui-animatediff/loras/ folder.

Note: LoRAs only work with the AnimateDiff v2 motion module, mm_sd_v15_v2.ckpt.

New node: AnimateDiffLoraLoader

image

Example workflow: image

Workflow: lora.json

Samples:

image
image
image
image

Update 2023/09/21

Sliding Window is now available!

The sliding window feature lets you generate GIFs without a frame length limit. It divides the frames into smaller batches with a slight overlap. The feature is activated automatically when generating more than 16 frames. To change the trigger number and other settings, use the SlidingWindowOptions node. See the sample workflow below.
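
The idea of overlapping batches can be sketched as follows. This is an illustration under assumptions, not the scheduler this repo actually uses: the function name `sliding_windows` and the default overlap of 4 are hypothetical, and only the 16-frame trigger comes from the text above.

```python
def sliding_windows(frame_count, context_length=16, overlap=4):
    """Split frame indices into overlapping windows of at most context_length frames."""
    if context_length <= overlap:
        raise ValueError("context_length must be greater than overlap")
    if frame_count <= context_length:
        # Short animations fit in a single window; no sliding needed.
        return [list(range(frame_count))]
    step = context_length - overlap
    windows = []
    start = 0
    while start + context_length < frame_count:
        windows.append(list(range(start, start + context_length)))
        start += step
    # Align the final window to the end so every frame is covered.
    windows.append(list(range(frame_count - context_length, frame_count)))
    return windows
```

For example, `sliding_windows(32)` yields three 16-frame windows starting at frames 0, 12, and 16, so neighbouring windows share frames and the result can be blended across the seams.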

Nodes

AnimateDiffLoader

image

AnimateDiffSampler

  • Mostly the same as KSampler
  • motion_module: use AnimateDiffLoader to load the motion module
  • inject_method: should be left at the default
  • frame_number: animation length
  • latent_image: You can pass an EmptyLatentImage
  • sliding_window_opts: custom sliding window options
image

AnimateDiffCombine

  • Combine GIF frames and produce the GIF image
  • frame_rate: number of frames per second
  • loop_count: use 0 for infinite loop
  • save_image: whether the GIF should be saved to disk
  • format: supports image/gif, image/webp (better compression), video/webm, video/h264-mp4, video/h265-mp4. To use video formats, you'll need ffmpeg installed and available in PATH
image
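
Since the video formats depend on ffmpeg being on PATH, a pre-flight check along these lines may save a failed run. It is a sketch only: the helper names are hypothetical and this is not how the node itself validates its inputs. `shutil.which` is the standard-library way to look up an executable on PATH.

```python
import shutil


def needs_ffmpeg(fmt):
    """Formats listed above with a video/ prefix are the ones that require ffmpeg."""
    return fmt.startswith("video/")


def frame_duration_ms(frame_rate):
    """Per-frame display time in milliseconds for a given frame_rate."""
    return 1000.0 / frame_rate


def check_format(fmt):
    """Raise early if a video format is requested but ffmpeg is not on PATH."""
    if needs_ffmpeg(fmt) and shutil.which("ffmpeg") is None:
        raise RuntimeError(f"{fmt} requires ffmpeg to be installed and available in PATH")
```

With `frame_rate` set to 8, each frame is shown for 125 ms.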

SlidingWindowOptions

Custom sliding window options

  • context_length: number of frames per window. Use 16 to get the best results. Reduce it if you have low VRAM.
  • context_stride:
    • 1: sampling every frame
    • 2: sampling every frame then every second frame
    • 3: sampling every frame then every second frame then every third frame
    • ...
  • context_overlap: number of overlapping frames between window slices
  • closed_loop: makes the GIF a closed loop; this adds more sampling steps
image
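
Reading the context_stride list above literally, each stride level adds one more pass over the frames with a larger step. The sketch below follows that wording only; the function name `stride_passes` is hypothetical, and the step sizes the actual implementation uses may differ.

```python
def stride_passes(frame_count, context_stride):
    """Return one list of frame indices per pass: step 1, then step 2, ...
    up to context_stride, following the README's description literally."""
    if context_stride < 1:
        raise ValueError("context_stride must be at least 1")
    return [
        list(range(0, frame_count, step))
        for step in range(1, context_stride + 1)
    ]
```

Under this reading, `stride_passes(6, 2)` gives one pass over every frame followed by one pass over every second frame.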

LoadVideo

Load a GIF or video as images. Useful for loading a GIF as ControlNet input.

  • frame_start: Skip the beginning frames and start at frame_start
  • frame_limit: Only take frame_limit frames
image
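
The effect of the two options is roughly that of a list slice. A minimal sketch, assuming frames are already decoded into a list; the function name `select_frames` is hypothetical, and treating `None` as "no limit" is an assumption, since the README does not say how the node represents an unlimited frame count.

```python
def select_frames(frames, frame_start=0, frame_limit=None):
    """Skip frame_start frames, then keep at most frame_limit frames."""
    selected = frames[frame_start:]
    if frame_limit is not None:
        # Assumption: None means "take everything that is left".
        selected = selected[:frame_limit]
    return selected
```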

Workflows

Simple txt2gif

image

Workflow: simple.json

Samples:

animate_diff_01

animate_diff_02

Long duration with sliding window

image

Workflow: sliding-window.json

Samples:

image
image

Latent upscale

Upscale the latent output using LatentUpscale, then do a second pass with AnimateDiffSampler.

image

Workflow: latent-upscale.json

Samples: animate_diff_upscale

Using with ControlNet

You will need the following additional nodes:

Animate with starting and ending images

  • Use LatentKeyframe and TimestampKeyframe from ComfyUI-Advanced-ControlNet to apply different weights for each latent index.
  • Use 2 ControlNet modules, one per image, with the weights reversed.

image
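
One way to picture "different weights for each latent index" with the second set reversed is a pair of mirrored ramps. The sketch below only computes the numbers and does not call the ComfyUI-Advanced-ControlNet nodes; the function name `crossfade_weights` is hypothetical and the linear ramp is an assumption, since other curves would work as well.

```python
def crossfade_weights(frame_count):
    """Return (start_weights, end_weights): the start image's weight falls
    linearly from 1 to 0 across the latent indices, and the end image's
    weights are the same values in reverse order."""
    if frame_count < 2:
        raise ValueError("need at least two frames to interpolate")
    start = [1.0 - i / (frame_count - 1) for i in range(frame_count)]
    end = list(reversed(start))
    return start, end
```

Each pair of weights sums to 1, so the starting image dominates the first frames and the ending image dominates the last ones.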

Workflow: cn-2images.json

Samples:

Using GIF as ControlNet input

Using a GIF (or video, or a list of images) as ControlNet input.

image

Workflow: cn-vid2vid.json

Samples:

Known Issues

CUDA error: invalid configuration argument

It's an xformers bug accidentally triggered by the way the original AnimateDiff CrossAttention is passed in. The current workaround is to disable xformers with --disable-xformers when booting ComfyUI.

GIF split into multiple scenes

AnimateDiff_00007_

Workarounds:

  • Shorten your prompt and negative prompt
  • Reduce resolution. AnimateDiff is trained on 512x512 images so it works best with 512x512 output.
  • Disable xformers with --disable-xformers

GIF has watermark (especially when using mm_sd_v15)

See: continue-revolution/sd-webui-animatediff#31

The training data used by the authors of the AnimateDiff paper contained Shutterstock watermarks. Since mm_sd_v15 was finetuned on finer, less drastic movement, the motion module attempts to replicate the transparency of that watermark, so the watermark is not blurred away as it is with mm_sd_v14. Try other community finetuned modules.