neggles/animatediff-cli

a CLI utility/library for AnimateDiff stable diffusion generation

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

animatediff

A refactor of animatediff, because I can, with significantly lower VRAM usage.

Also, infinite generation length support! yay!

LoRA loading is ABSOLUTELY NOT IMPLEMENTED YET!

PRs welcome! 😆😅

This can theoretically run on CPU, but it's not recommended. It should work fine on a GPU, NVIDIA or otherwise, but I haven't tested on non-CUDA hardware. It uses PyTorch 2.0 Scaled-Dot-Product Attention (aka builtin xformers) by default, but you can pass --xformers to force using xformers if you really want.

How to use

I should write some more detailed steps, but here's the gist of it:

git clone https://github.com/neggles/animatediff-cli
cd animatediff-cli
python3.10 -m venv .venv
source .venv/bin/activate
# install Torch. Use whatever your favourite torch version >= 2.0.0 is, but, good luck on non-nVidia...
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# install the rest of all the things (probably! I may have missed some deps.)
python -m pip install -e '.[dev]'
# you should now be able to
animatediff --help
# There's a nice pretty help screen with a bunch of info that'll print here.
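
If you're not sure whether the Torch you ended up with clears the >= 2.0.0 bar, here's a minimal sketch of a version check. The helper names (`parse_release`, `meets_minimum`) are made up for illustration and are not part of animatediff-cli; it only assumes version strings shaped like `2.0.1+cu118`.

```python
import re


def parse_release(version: str) -> tuple[int, ...]:
    """Return the numeric release part of a version string such as '2.0.1+cu118'."""
    public = version.split("+", 1)[0]  # drop the local build tag, e.g. '+cu118'
    match = re.match(r"\d+(?:\.\d+)*", public)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: tuple[int, ...] = (2, 0, 0)) -> bool:
    """True if `version` is at least `minimum` (hypothetical helper, not a project API)."""
    release = parse_release(version)
    # Pad short versions so that '2.0' compares equal to (2, 0, 0).
    padded = release + (0,) * (len(minimum) - len(release))
    return padded >= minimum
```

Inside the venv you'd feed it the installed version, i.e. `meets_minimum(torch.__version__)` after `import torch`.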

From here you'll need to put whatever checkpoint you want to use into data/models/sd, copy one of the prompt configs in config/prompts, edit it with your choices of prompt and model (model paths in prompt .json files are relative to data/, e.g. models/sd/vanilla.safetensors), and off you go.
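
To make the "relative to data/" bit concrete, here's a rough sketch of how a model path from a prompt config maps onto disk. The key name `"path"` and the helper `resolve_model_path` are assumptions for illustration only; check the shipped configs in config/prompts for the real field names.

```python
from pathlib import Path

# Assumption: the repo's data directory, relative to where you run the command.
DATA_DIR = Path("data")


def resolve_model_path(config: dict, data_dir: Path = DATA_DIR, key: str = "path") -> Path:
    """Join the data/-relative model path stored in a prompt config onto data_dir.

    `key` is a hypothetical field name, not confirmed by this README.
    """
    return data_dir / Path(config[key])


# A stand-in for a loaded prompt .json file, using the example path from above.
example_config = {"path": "models/sd/vanilla.safetensors"}
print(resolve_model_path(example_config))
```

So a checkpoint dropped into data/models/sd gets referenced in the config without the leading data/.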

Then it's something like (for an 8GB card):

animatediff generate -c 'config/prompts/waifu.json' -W 576 -H 576 -L 128 -C 16

You may have to drop -C (the context length, in frames) to 8 on cards with less than 8GB of VRAM, and you can raise it to 20-24 on cards with more; 24 is the maximum.
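
If you're scripting this, the rule of thumb above can be sketched roughly like so. `suggest_context` and `build_generate_command` are hypothetical helpers, not part of the CLI, and the VRAM thresholds are just the ones from this README; only the flags (-c, -W, -H, -L, -C) come from the command shown earlier.

```python
MAX_CONTEXT = 24  # per the README, 24 is the maximum value for -C


def suggest_context(vram_gb: float) -> int:
    """Pick a -C value from the README's rule of thumb (a heuristic, not a guarantee)."""
    if vram_gb < 8:
        return 8
    if vram_gb <= 8:
        return 16
    return 20  # the README suggests 20-24 here; start at the low end


def build_generate_command(config, width=576, height=576, length=128, context=16):
    """Build the argv list for `animatediff generate` without running it."""
    if not 1 <= context <= MAX_CONTEXT:
        raise ValueError(f"context must be between 1 and {MAX_CONTEXT}, got {context}")
    return [
        "animatediff", "generate",
        "-c", str(config),
        "-W", str(width),
        "-H", str(height),
        "-L", str(length),
        "-C", str(context),
    ]


command = build_generate_command("config/prompts/waifu.json", context=suggest_context(8))
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` from inside the activated venv.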

N.B. generating 128 frames is slow...

RiFE!

I have added experimental support for rife-ncnn-vulkan using the animatediff rife interpolate command. It has fairly self-explanatory help, and it has been tested on Linux, but I've no idea if it'll work on Windows.

Either way, you'll need ffmpeg installed on your system and present in PATH, and you'll need to download the rife-ncnn-vulkan release for your OS of choice from the GitHub repo (above). Unzip it, and place the extracted folder at data/rife/. You should have a data/rife/rife-ncnn-vulkan executable, or data\rife\rife-ncnn-vulkan.exe on Windows.
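
A quick way to sanity-check that setup before running anything, sketched in Python. The function names are made up for illustration, and this is not how the CLI itself checks things (I'm only assuming the paths described above).

```python
import shutil
import sys
from pathlib import Path


def rife_executable(data_dir: Path, platform: str = sys.platform) -> Path:
    """Expected location of the rife-ncnn-vulkan binary under data/rife/."""
    name = "rife-ncnn-vulkan.exe" if platform.startswith("win") else "rife-ncnn-vulkan"
    return Path(data_dir) / "rife" / name


def missing_prerequisites(data_dir: Path) -> list[str]:
    """Return human-readable problems; an empty list means everything was found."""
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found in PATH")
    executable = rife_executable(data_dir)
    if not executable.is_file():
        problems.append(f"{executable} not found")
    return problems


for problem in missing_prerequisites(Path("data")):
    print(problem)
```

Run it from the repo root; if it prints nothing, both ffmpeg and the RIFE binary are where they're expected to be.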

You'll also need to reinstall the repo/package with:

python -m pip install -e '.[rife]'

or just install ffmpeg-python yourself.

The default is to multiply the frame count by 8, turning an 8fps animation into a 64fps one, then encode that to a 60fps WebM. (If you pick GIF mode, it'll be 50fps, because GIFs are cursed and encode frame durations in hundredths of a second.)
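
The arithmetic behind those numbers, as a small sketch. This is my reading of why 60fps isn't available for GIF, not the code the tool actually runs; both function names are made up for illustration.

```python
def interpolated_fps(source_fps: float, multiplier: int) -> float:
    """Frame rate after inserting frames so that each source frame becomes `multiplier` frames."""
    return source_fps * multiplier


def nearest_gif_fps(target_fps: float) -> float:
    """Closest frame rate a GIF can express, given whole-centisecond frame delays."""
    # GIF stores each frame's delay as an integer number of 1/100ths of a second.
    delay_cs = max(1, round(100 / target_fps))
    return 100 / delay_cs


print(interpolated_fps(8, 8))  # 8fps source, multiplied by 8
print(nearest_gif_fps(60))     # 60fps would need a 1.67cs delay, which rounds to 2cs
```

A 60fps GIF would need a delay of about 1.67 hundredths of a second per frame; the nearest whole values are 2 (50fps) and 1 (100fps), hence 50fps.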

Seems to work pretty well...

TODO:

In no particular order:

  • Infinite generation length support
  • RIFE support for motion interpolation (rife-ncnn-vulkan isn't the greatest implementation)
  • Export RIFE interpolated frames to a video file (webm, mp4, animated webp, hevc mp4, gif, etc.)
  • Generate infinite length animations on a 6-8GB card (at 512x512 with 8-frame context, but hey it'll do)
  • Torch SDP Attention (makes xformers optional)
  • Support for clip_skip in prompt config
  • Experimental support for torch.compile() (upstream Diffusers bugs slow this down a little but it's still zippy)
  • Batch your generations with --repeat! (e.g. --repeat 10 will repeat all your prompts 10 times)
  • Call the animatediff.cli.generate() function from another Python program without reloading the model every time
  • Drag remaining old Diffusers code up to latest (mostly)
  • Add a webUI (maybe, there are people wrapping this already so maybe not?)
  • img2img support (start from an existing image and continue)
  • Stop using custom modules where possible (should be able to use Diffusers for almost all of it)
  • Automatic generate-then-interpolate-with-RIFE mode

Credits:

See guoyww/AnimateDiff (very little of this is my work).

N.B. the copyright notice in COPYING is missing the original authors' names, solely because the original repo (as of this writing) has no name attached to the license. I have, however, used the same license they did (Apache 2.0).