Issue with running repo

Question

Issue with running repo

filliptm opened this issue 8 months ago · 3 comments

Answer 1 · 2024-02-02T20:30:48.000Z

First, try creating a conda environment using the instructions in the repository, then run from there. Using the latest Torch 2 version and latest Xformers, I'm able to train.

git clone https://github.com/ExponentialML/AnimateDiff-MotionDirector
cd AnimateDiff-MotionDirector

conda env create -f environment.yaml
conda activate animatediff

pip install -r requirements.txt

If that fails, try the command below in a conda environment.

The issue seems to be with your version of Xformers, which is required to run the training code (the AnimateDiff code was built on an old version of Diffusers and hasn't been updated to work with Torch 2.0's native scaled dot-product attention).

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 xformers==0.0.16 --extra-index-url https://download.pytorch.org/whl/cu117

These are the versions used in the AnimateDiff repository. While this may work, in my tests I've had convergence problems with these specific versions of Torch and Xformers, so your mileage may vary.
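Since the fallback pins exact versions, it can help to confirm what is actually installed in the active environment before and after running it. Here is a minimal sketch using only the standard library; the package list is an assumption taken from the pip command above, and the `installed_versions` helper is a hypothetical name, not part of the repository.

```python
from importlib import metadata

# Assumed list: the packages pinned in the pip command above.
PACKAGES = ("torch", "torchvision", "torchaudio", "xformers")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:12s} {version or 'not installed'}")
```

Run it inside the activated conda environment. Wheels from the PyTorch index typically carry a build suffix such as `+cu117` in the version string, so a missing suffix or a `+cpu` suffix is worth a second look.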

Answer 2 · 2024-02-10T03:30:24.000Z

I'm having the same problem. With the latest versions installed from requirements.txt, I get the following error:

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.2.0+cu121 with CUDA 1201 (you have 2.2.0+cpu)
    Python 3.10.11 (you have 3.10.13)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
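The warning above reports two mismatches: xFormers was built for a CUDA build of PyTorch (`2.2.0+cu121`) while the installed one is a CPU-only build (`2.2.0+cpu`), and the Python patch versions differ. The first is the one that stops the CUDA extensions from loading. The sketch below shows how to read the build tag out of a version string; `build_tag` and `is_cuda_build` are hypothetical helper names used only for illustration.

```python
def build_tag(version: str) -> str:
    """Return the local build tag of a version string, e.g. 'cu121' or 'cpu'.

    Returns an empty string when the version has no '+tag' suffix.
    """
    _, _, tag = version.partition("+")
    return tag


def is_cuda_build(version: str) -> bool:
    """True when the build tag names a CUDA toolkit (starts with 'cu')."""
    return build_tag(version).startswith("cu")


# Values copied from the warning above.
expected = "2.2.0+cu121"   # what xFormers was built for
installed = "2.2.0+cpu"    # what is actually installed

print(build_tag(expected), is_cuda_build(expected))    # cu121 True
print(build_tag(installed), is_cuda_build(installed))  # cpu False
```

In a live environment the same check can be applied to `torch.__version__`. If it reports a `+cpu` build, reinstalling PyTorch from the matching CUDA index, as in the pip command in Answer 1, is the likely fix.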

With the suggestion above, I get the missing-Triton message instead:

A matching Triton is not available, some optimizations will not be enabled. Error caught was: No module named 'triton'

After installing pytorch-triton, I get this error:

Traceback (most recent call last):
  File "X:\ai\AnimateDiff-MotionDirector\train.py", line 20, in <module>
    import diffusers
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\diffusers\__init__.py", line 28, in <module>
    from .models import (
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\diffusers\models\__init__.py", line 19, in <module>
    from .attention import Transformer2DModel
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\diffusers\models\attention.py", line 43, in <module>
    import xformers.ops
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\xformers\ops\__init__.py", line 8, in <module>
    from .fmha import (
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\xformers\ops\fmha\__init__.py", line 10, in <module>
    from . import cutlass, flash, small_k, triton
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\xformers\ops\fmha\triton.py", line 15, in <module>
    if TYPE_CHECKING or _is_triton_available():
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\xformers\__init__.py", line 33, in func_wrapper
    value = func()
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\xformers\__init__.py", line 44, in _is_triton_available
    from xformers.triton.softmax import softmax as triton_softmax # noqa
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\xformers\triton\__init__.py", line 12, in <module>
    from .dropout import FusedDropoutBias, dropout # noqa
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\xformers\triton\dropout.py", line 13, in <module>
    import triton
  File "C:\Users\Admin\anaconda3\envs\AnimateDiff-MotionDirector\lib\site-packages\triton\__init__.py", line 1, in <module>
    raise RuntimeError("Should never be installed")
RuntimeError: Should never be installed

Any clue? Thanks in advance! Really eager to test your amazing code!! ^_^
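The traceback in this comment ends inside `triton\__init__.py`, whose first line raises `RuntimeError("Should never be installed")`. That suggests the installed `triton` module is a placeholder rather than a working Triton build, so uninstalling it would at least bring back the earlier, non-fatal "matching Triton is not available" message. The sketch below probes an import without crashing; `probe_import` is a hypothetical helper, and catching the broad `Exception` is deliberate because this failure is not an `ImportError`.

```python
import importlib


def probe_import(module_name: str):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # a placeholder package may raise a non-ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    for name in ("triton", "xformers"):
        ok, detail = probe_import(name)
        print(f"{name}: {'ok' if ok else 'failed'} ({detail})")
```

If `triton` fails with the `RuntimeError` shown in the traceback, removing that package is a reasonable first step before looking at the Xformers install itself.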

Answer 3 · 2024-02-12T23:47:33.000Z

I followed the instructions in #2 (comment), as I had the same issue as above.

I now have the missing-Triton issue as well (running Windows 11).
However, training still runs, though it seems to be running on the CPU.

I do have a Threadripper Pro 5955WX and 124 GB of RAM, so that's probably carrying it.

(animatediff-motiondirector) C:\Users\VA-AI.RENDER\Documents\AnimateDiff-MotionDirector>python train.py --config ./configs/training/motion_director/my_video.yaml
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
Cannot initialize model with low cpu memory usage because `accelerate` was not found in the environment. Defaulting to `low_cpu_mem_usage=False`. It is strongly recommended to install `accelerate` for faster and less memory-intense model loading. You can do so with:

pip install accelerate

.
loaded 3D unet's pretrained weights from models/StableDiffusion/stable-diffusion-v1-5\unet ...
### missing keys: 560;
### unexpected keys: 0;
### Motion Module Parameters: 417.1376 M
Caching Latents.:   0%|                                                                          | 0/1 [00:00<?, ?it/s]A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
Caching Latents.: 100%|██████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00,  8.31s/it]
load motion module from models/Motion_Module/v3_sd15_mm.ckpt
Using LoRA Version: cloneofsimo
Lora successfully injected into UNet3DConditionModel.
Using LoRA Version: cloneofsimo
Lora successfully injected into UNet3DConditionModel.
02/13/2024 10:40:43 - INFO - root - ***** Running training *****
02/13/2024 10:40:43 - INFO - root -   Num examples = 1
02/13/2024 10:40:43 - INFO - root -   Num Epochs = 503
02/13/2024 10:40:43 - INFO - root -   Instantaneous batch size per device = 1
02/13/2024 10:40:43 - INFO - root -   Total train batch size (w. parallel, distributed & accumulation) = 1
02/13/2024 10:40:43 - INFO - root -   Gradient Accumulation steps = 1
02/13/2024 10:40:43 - INFO - root -   Total optimization steps = 503
Steps:   0%|                                                                                   | 0/503 [00:00<?, ?it/s]A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
100%|██████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 18.93it/s]
Dataset sanity check...: 100%|███████████████████████████████████████████████████████████| 1/1 [00:02<00:00,  2.04s/it]
C:\Users\VA-AI.RENDER\AppData\Local\miniconda3\envs\animatediff-motiondirector\lib\site-packages\torch\optim\lr_scheduler.py:138: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
C:\Users\VA-AI.RENDER\AppData\Local\miniconda3\envs\animatediff-motiondirector\lib\site-packages\torch\optim\lr_scheduler.py:257: UserWarning: To get the last learning rate computed by the scheduler, please use `get_last_lr()`.
  warnings.warn("To get the last learning rate computed by the scheduler, "
Steps:   0%| | 2/503 [00:08<30:27,  3.65s/it, Spatial LR=1e-5, Spatial Loss=0.0355, Temporal LR=5e-5, Temporal Loss=0.0
a highly realistic video of batman running in a mystic forest, depth of field, epic lights, high quality, trending on artstation
100%|██████████████████████████████████████████████████████████████████████████████████| 25/25 [00:19<00:00,  1.30it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 25.07it/s]
02/13/2024 10:41:13 - INFO - root - Saved samples to outputs\2024-02-13\man_running_my_video-10-40-22/samples/sample-2.gif
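The log above does not say which device the training ran on, and the Triton message is only a warning that some optimizations are disabled. To check whether PyTorch can see the GPU, as opposed to falling back to the CPU as suspected in this comment, a minimal sketch like the following can be run in the same environment. `describe_device` is a hypothetical helper; the `torch` calls it uses are standard PyTorch attributes.

```python
def describe_device() -> str:
    """Return a one-line summary of the device PyTorch would use."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        return f"cuda: {name} (torch {torch.__version__}, CUDA {torch.version.cuda})"
    return f"cpu only (torch {torch.__version__})"


print(describe_device())
```

A "cpu only" result together with a `+cpu` suffix in the version string would point to a CPU-only PyTorch build rather than a problem in the training script.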