ykk648/AnimateDiff-I2V

After switching to fp16, the UNet's precision is inconsistent with the rest of the pipeline

Closed this issue · 10 comments

When the unet is loaded with from_pretrained(), it seems the dtype argument is not passed in, so everything else ends up in fp16 while the unet stays fp32:

Traceback (most recent call last):
  File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/yueye/code/AnimateDiff/scripts/animate.py", line 237, in <module>
    main(args)
  File "/home/yueye/code/AnimateDiff/scripts/animate.py", line 189, in main
    sample = pipeline(
  File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/yueye/code/AnimateDiff/animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py", line 251, in __call__
    ref_image_latents = self.prepare_ref_latents(
  File "/home/yueye/code/AnimateDiff/animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py", line 74, in prepare_ref_latents
    ref_image_latents = self.vae.encode(refimage).latent_dist.sample(generator=generator)
  File "/home/yueye/code/AnimateDiff/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/home/yueye/code/AnimateDiff/diffusers/src/diffusers/models/autoencoder_kl.py", line 242, in encode
    h = self.encoder(x)
  File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yueye/code/AnimateDiff/diffusers/src/diffusers/models/vae.py", line 111, in forward
    sample = self.conv_in(sample)
  File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
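
For reference, the usual diffusers-style fix is to pass torch_dtype at load time (or cast the module afterwards). A minimal sketch, assuming the stock UNet2DConditionModel API; the repo's UNet3DConditionModel loader may differ, and the base model path here is hypothetical:

import torch
from diffusers import UNet2DConditionModel

# Load the unet directly in fp16 so its weights match the rest of the
# fp16 pipeline.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # hypothetical base model
    subfolder="unet",
    torch_dtype=torch.float16,
)

# Equivalent after-the-fact cast, for a loader that ignores torch_dtype:
# unet = unet.half()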
ykk648 commented

Change it like this: 5bdbfeb

After switching to float16 I still run out of memory, at the self.unet() call in animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py.
I'm on a 3090 with 24 GB of VRAM, so I don't understand how you achieved the 13 GB float16 footprint you mentioned...

ykk648 commented

I suggest you first get it running without controlnet. Also check that xformers is set up.

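As a rough sketch of that setup, assuming a stock diffusers pipeline (the repo's custom pipeline class should inherit the same helpers; the checkpoint path is hypothetical):

import torch
from diffusers import StableDiffusionPipeline

# Load everything in fp16 and move it to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Memory-efficient attention cuts peak VRAM; requires xformers installed.
pipe.enable_xformers_memory_efficient_attention()

# Optional: decode the VAE in slices to shave off a bit more memory.
pipe.enable_vae_slicing()
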
After disabling IP-Adapter and ControlNet it runs, but there's still one problem: why is the saved sample.gif a static image? Am I saving it incorrectly?

ykk648 commented

Check the individual frames:

from PIL import Image
import os

def gif_to_images(gif_path, save_dir):
    os.makedirs(save_dir, exist_ok=True)  # make sure the output dir exists
    with Image.open(gif_path) as im:
        # Collect every frame of the GIF
        frames = []
        for frame_index in range(im.n_frames):
            im.seek(frame_index)
            frames.append(im.copy())

        # Save each frame as a static image, e.g.
        # save_dir/frame1.png, save_dir/frame2.png, ...
        for i, frame in enumerate(frames):
            save_path = f"{save_dir}/frame{i+1}.png"
            frame.save(save_path, "PNG")
            print(f"Saved frame {i+1} as {save_path}")


It is indeed 16 frames, but the frames are all basically identical; there's no motion.

all:
  base: ""
  lora: ""

  init_image: "__assets__/ipadapter/An_astronaut_is_riding_a_horse_on_Mars_seed-444264997.png"
  denoise_strength: 0.84

  enable_ipadapter: false
  ip_strength: 1

  enable_controlnet: true
  controlnet_name: reference
  controlnet_image: "__assets__/ipadapter/An_astronaut_is_riding_a_horse_on_Mars_seed-444264997.png"

  motion_module:
    - "models/Motion_Module/mm_sd_v15_v2.ckpt"

  seed:           [444264997]
  steps:          25
  guidance_scale: 7.5
  lora_alpha: 0.8

  prompt:
    - "An astronaut is riding a horse on Mars"
  n_prompt:
    - "monochrome, lowres, bad anatomy, worst quality, low quality"
ykk648 commented

@KyleYueye With these parameters that is exactly the result you get. The controlnet is currently applied to every frame by default, so the reference is too strong. If you want to adjust it, modify https://github.com/ykk648/AnimateDiff/blob/5bdbfeb3e92dee379f9c543930aa591f89a5b04f/animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py#L260

ref_image_latents = ref_image_latents.unsqueeze(2).repeat(1, 1, video_length, 1, 1)
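
One illustrative way to weaken it, an assumption about what might help rather than the repo's fix, is to ramp the reference strength down over the frames instead of repeating it at full strength:

import torch

# ref_image_latents: (B, C, H, W) latents of the reference image;
# video_length comes from the pipeline call. Scale each frame's copy by a
# decaying weight so later frames are constrained less and free to move.
ref = ref_image_latents.unsqueeze(2).repeat(1, 1, video_length, 1, 1)
weights = torch.linspace(1.0, 0.2, video_length, device=ref.device, dtype=ref.dtype)
ref_image_latents = ref * weights.view(1, 1, -1, 1, 1)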

OK, thanks a lot. But even with controlnet turned off the output still seems almost static.

The result I get is also almost static, and I don't see controlnet actually being used in the code. How do I reproduce the results shown in the readme?

ykk648 commented

enable_controlnet: true


All of them use controlnet.