UNet precision mismatch after switching to fp16
Closed this issue · 10 comments
It looks like when the UNet is loaded with from_pretrained(), the dtype argument is not passed in, so everything else ends up in fp16 while the UNet stays fp32.
Traceback (most recent call last):
File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/yueye/code/AnimateDiff/scripts/animate.py", line 237, in <module>
main(args)
File "/home/yueye/code/AnimateDiff/scripts/animate.py", line 189, in main
sample = pipeline(
File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/yueye/code/AnimateDiff/animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py", line 251, in __call__
ref_image_latents = self.prepare_ref_latents(
File "/home/yueye/code/AnimateDiff/animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py", line 74, in prepare_ref_latents
ref_image_latents = self.vae.encode(refimage).latent_dist.sample(generator=generator)
File "/home/yueye/code/AnimateDiff/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/home/yueye/code/AnimateDiff/diffusers/src/diffusers/models/autoencoder_kl.py", line 242, in encode
h = self.encoder(x)
File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yueye/code/AnimateDiff/diffusers/src/diffusers/models/vae.py", line 111, in forward
sample = self.conv_in(sample)
File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/yueye/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
Change it like this: 5bdbfeb
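Presumably the change amounts to passing the dtype explicitly when loading the UNet; a minimal sketch using the standard diffusers API (not verified against commit 5bdbfeb, and the model path is a placeholder):

import torch
from diffusers import UNet2DConditionModel

# Hypothetical example: load the UNet in fp16 so it matches the rest of the
# fp16 pipeline. Without torch_dtype, from_pretrained keeps the weights in fp32.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder model path
    subfolder="unet",
    torch_dtype=torch.float16,
)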
After switching to float16 I still run out of memory; it happens in the self.unet() call in animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py.
I'm using a 3090 with 24 GB of VRAM, so I'm not sure how the 13 GB in float16 you mentioned is achieved...
I'd suggest getting it to run without ControlNet first, and also check xformers.
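Checking xformers usually means making sure it is installed and that memory-efficient attention is enabled on the pipeline; a minimal sketch with the standard diffusers API (the model path is a placeholder, and the actual pipeline class in this repo differs):

import torch
from diffusers import StableDiffusionPipeline

# Hypothetical check: if xformers is installed, diffusers pipelines can switch
# to memory-efficient attention, which cuts VRAM use during sampling.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder model path
    torch_dtype=torch.float16,
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()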
After disabling IP-Adapter and ControlNet it runs now, but there is another question: why is the saved sample.gif a static image? Is there a problem with how I'm saving it?
Check each frame of the GIF:

from PIL import Image

def gif_to_images(gif_path, save_dir):
    with Image.open(gif_path) as im:
        # Collect every frame of the GIF
        frames = []
        for frame in range(im.n_frames):
            im.seek(frame)
            frames.append(im.copy())
    # Save each frame as a static image
    for i, frame in enumerate(frames):
        # Build the save path, e.g. save_dir/frame1.png, save_dir/frame2.png, ...
        save_path = f"{save_dir}/frame{i+1}.png"
        frame.save(save_path, "PNG")
        print(f"Saved frame {i+1} as {save_path}")
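Usage would look something like this (paths are placeholders, and the output directory must already exist):

gif_to_images("samples/sample.gif", "samples/frames")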
It is indeed 16 frames, but every frame is basically identical; there is no motion.
all:
  base: ""
  lora: ""
  init_image: "__assets__/ipadapter/An_astronaut_is_riding_a_horse_on_Mars_seed-444264997.png"
  denoise_strength: 0.84
  enable_ipadapter: false
  ip_strength: 1
  enable_controlnet: true
  controlnet_name: reference
  controlnet_image: "__assets__/ipadapter/An_astronaut_is_riding_a_horse_on_Mars_seed-444264997.png"
  motion_module:
    - "models/Motion_Module/mm_sd_v15_v2.ckpt"
  seed: [444264997]
  steps: 25
  guidance_scale: 7.5
  lora_alpha: 0.8
  prompt:
    - "An astronaut is riding a horse on Mars"
  n_prompt:
    - "monochrome, lowres, bad anatomy, worst quality, low quality"
@KyleYueye With this set of parameters, that is the expected result. ControlNet is currently applied to every frame by default, and the reference is too strong. If you want to adjust it, you can modify https://github.com/ykk648/AnimateDiff/blob/5bdbfeb3e92dee379f9c543930aa591f89a5b04f/animatediff/pipelines/stablediffusion_controlnet_reference_animatediff_pipeline.py#L260
ref_image_latents = ref_image_latents.unsqueeze(2).repeat(1, 1, video_length, 1, 1)
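To illustrate what that line does, and one hypothetical way to weaken the per-frame reference (untested, not the repo's actual fix; ref_weight and the tensor shapes below are made up for the sketch):

import torch

# Toy shapes just for illustration; in the pipeline these come from the VAE encoder.
batch, channels, height, width, video_length = 1, 4, 64, 64, 16
ref_image_latents = torch.randn(batch, channels, height, width)

# Current behaviour: the single reference latent is repeated across every frame,
# so the reference image constrains all frames equally.
per_frame_ref = ref_image_latents.unsqueeze(2).repeat(1, 1, video_length, 1, 1)

# Hypothetical, untested tweak: blend the reference with noise so the constraint
# is weaker and the motion module has more room to move.
ref_weight = 0.5
per_frame_ref = ref_weight * per_frame_ref + (1 - ref_weight) * torch.randn_like(per_frame_ref)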
OK, thanks a lot. But even with ControlNet turned off, the result still seems almost static.
The result I get is also almost static, and as far as I can tell the code doesn't use ControlNet. How do I reproduce the results shown in the README?
They all use ControlNet.