I fixed the code for you

Question

602387193c opened this issue 4 months ago · 7 comments

So how do I actually fix this error? I don't understand code at all.

Error occurred when executing AnyText:

Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnyText\AnyText\nodes.py", line 258, in anytext_process
    x_samples, results, rtn_code, rtn_warning, debug_info = pipe(input_data, font_path=loader_out[0], **params)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnyText\AnyText\AnyText_scripts\AnyText_pipeline.py", line 225, in __call__
    encoder_posterior = self.model.encode_first_stage(masked_img[None, ...])
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnyText\AnyText\AnyText_scripts\ldm\models\diffusion\ddpm.py", line 870, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnyText\AnyText\AnyText_scripts\ldm\models\autoencoder.py", line 83, in encode
    h = self.encoder(x)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnyText\AnyText\AnyText_scripts\ldm\modules\diffusionmodules\model.py", line 523, in forward
    hs = [self.conv_in(x)]
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "F:\comfyui-anytext666\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,

This is the error message.

Answer 1 · 2024-06-30T09:36:24.000Z

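This error means the model's input and its weights are on different devices. Specifically, the input is a torch.cuda.FloatTensor (a float tensor on the GPU), while the weights are a torch.FloatTensor (a float tensor on the CPU). This usually happens because the model and the data were not both moved to the same device (CPU or GPU).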
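To see the mismatch outside ComfyUI, here is a minimal sketch. The helper name `describe_placement` and the small `Conv2d` layer are illustrative assumptions, not code from AnyText. It only reports where the weights and the input live; on a machine with a GPU, calling `conv(image)` at the end would raise the same kind of error as in the traceback.

```python
import torch
from torch import nn


def describe_placement(module: nn.Module, x: torch.Tensor) -> dict:
    """Report where a module's weights and an input tensor live (hypothetical helper)."""
    weight = next(module.parameters())  # the first parameter stands in for the whole module
    return {
        "weight_device": weight.device.type,
        "input_device": x.device.type,
        "same_device": weight.device == x.device,
    }


conv = nn.Conv2d(3, 8, kernel_size=3)   # freshly built modules live on the CPU
image = torch.rand(1, 3, 32, 32)
if torch.cuda.is_available():
    image = image.cuda()                # input on the GPU, weights still on the CPU

report = describe_placement(conv, image)
print(report)
```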

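Let's try to fix it step by step:

First, make sure the "all_to_device" parameter of your AnyText_loader node is set to True. This should ensure every part of the model is moved to the correct device.
In the AnyText node, set the "device" parameter to "cuda" (if you have an NVIDIA GPU) or "cpu" (if you have no GPU or want to run on the CPU). Do not use "auto".
If the steps above do not solve the problem, you can try modifying the code. You said you don't understand code, but I will give you specific instructions: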

Open the file F:\comfyui-anytext666\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-AnyText\AnyText\AnyText_scripts\AnyText_pipeline.py

Find the __call__ method (around line 200), and just before the line encoder_posterior = self.model.encode_first_stage(masked_img[None, ...]), add the following code:
```
self.model = self.model.to(self.device)
masked_img = masked_img.to(self.device)
```
This ensures the model and the input data are on the same device.
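One detail worth knowing about the two lines above: `nn.Module.to()` moves a module's parameters in place, while `Tensor.to()` returns a tensor that has to be assigned back. A minimal sketch, using a small `nn.Linear` layer as a stand-in for the real model (an assumption for illustration only):

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)
x = torch.rand(1, 4)

# nn.Module.to() moves the parameters in place and returns the same module.
moved_model = model.to(device)
assert moved_model is model

# Tensor.to() returns a tensor on the target device (the same object when
# nothing needs to change), so the result must be assigned back to `x`.
x = x.to(device)

out = model(x)
print(out.shape, out.device)
```

This is why the snippet reassigns `masked_img`: calling `masked_img.to(self.device)` without the assignment would leave the original tensor where it was.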
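If you still run into problems, you can try running the whole pipeline on the CPU. In the AnyText_loader node, set "use_fp16" to False, and in the AnyText node set "device" to "cpu".
Finally, make sure your GPU driver is up to date and that your CUDA version is compatible with your PyTorch version.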
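A quick way to check the PyTorch and CUDA side of that last step is to ask PyTorch what it was built with. This is a rough sketch; the helper name `cuda_report` is made up for illustration. With the portable ComfyUI install it would need to be run with the bundled interpreter in the `python_embeded` folder so that it inspects the same PyTorch that ComfyUI uses.

```python
import torch


def cuda_report() -> dict:
    """Collect basic facts about the installed PyTorch build (hypothetical helper)."""
    available = torch.cuda.is_available()
    return {
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": available,
        "gpu_name": torch.cuda.get_device_name(0) if available else None,
    }


env = cuda_report()
for key, value in env.items():
    print(f"{key}: {value}")
```

If `cuda_available` is False on a machine with an NVIDIA GPU, the driver or the installed PyTorch build is the likelier culprit, not the AnyText code.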

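If none of these steps solve the problem, you may need to reinstall ComfyUI or the AnyText plugin. In that case, consider seeking more specialized technical support or opening an issue on the project's GitHub page.

Remember to back up your work before modifying any file. If you are not sure how to edit the file, ask a friend with programming experience for help.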

Answer 2 · 2024-06-30T09:38:10.000Z

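This is the file to modify:
AnyText\AnyText_scripts\AnyText_pipeline.py

Thank you for providing more context. Based on the code you provided, we can make a few adjustments to fix the device mismatch. Here is the modified code: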

# Make sure the model is on the correct device
self.model = self.model.to(self.device)

# Move masked_img to the correct device and set the correct dtype
masked_img = masked_img.to(self.device)
if self.use_fp16:
    masked_img = masked_img.half()
else:
    masked_img = masked_img.float()

# Print debug information
print(f"Device: {self.device}")
print(f"Model device: {next(self.model.parameters()).device}")
print(f"Input device: {masked_img.device}")
print(f"Input dtype: {masked_img.dtype}")

# Run the encoding
encoder_posterior = self.model.encode_first_stage(masked_img[None, ...])
masked_x = self.model.get_first_stage_encoding(encoder_posterior).detach()

if self.use_fp16:
    masked_x = masked_x.half()

info['masked_x'] = torch.cat([masked_x for _ in range(img_count)], dim=0)

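These changes do the following:

Make sure the model is on the correct device.
Move masked_img to the same device as the model.
Set the dtype of masked_img according to the value of self.use_fp16.
Add a few print statements to help with debugging.
Fix the syntax error on the last line (changing * to _, and img*count to img_count).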

These changes should resolve the device mismatch and keep the data types consistent.
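As an alternative to branching on a flag such as self.use_fp16, the input can simply copy the device and dtype of the module it is about to enter. The sketch below is an assumption-laden illustration, not the AnyText code: `match_module` is a hypothetical helper, and a float64 `Conv2d` stands in for "a model whose dtype differs from the input".

```python
import torch
from torch import nn


def match_module(x: torch.Tensor, module: nn.Module) -> torch.Tensor:
    """Return x on the same device and with the same dtype as the module's weights."""
    reference = next(module.parameters())
    return x.to(device=reference.device, dtype=reference.dtype)


encoder = nn.Conv2d(3, 8, kernel_size=3).double()  # float64 weights, on the CPU
masked_img = torch.rand(3, 32, 32)                  # float32 input

batch = match_module(masked_img, encoder)[None, ...]  # add the batch dimension
features = encoder(batch)
print(batch.dtype, features.shape)
```

The design choice here is that the module is treated as the source of truth, so the input cannot drift out of sync with it if the loader settings change.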

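After making these changes, save the file and run your workflow again. If you still run into problems, check the debug information in the console output; it will help us diagnose the problem further.

Remember to back up the file before modifying the code. If you are unsure about editing code, ask a friend with programming experience for help.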

Answer 3 · 2024-06-30T09:40:24.000Z

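The change goes after line 220. I don't understand the underlying principle myself, since I studied liberal arts; the above is the solution Claude 3.5 Sonnet gave.
To solve your problem I set up a dedicated project and copied all the necessary code into it as knowledge!
Claude 3.5 Sonnet is really impressive!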

Answer 4 · 2024-06-30T14:32:07.000Z

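Thanks, I'll try it when I have time. If it runs, I'll merge it in.
I didn't even know you could load the image onto the device like this and then adjust the data type.
As long as there is enough data, large models really are strong at retrieval and processing.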

Answer 5 · 2024-07-01T10:09:37.000Z

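I can't figure out why nobody ported this project for so long, while piles of projects that aren't much use get ported. Is the port really that difficult? I was even thinking of learning how to port it to ComfyUI myself, but luckily I searched and found your project, hahaha. Isn't generating text a pretty essential need? I've been using the original standalone project all along to make video covers.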

Answer 6 · 2024-07-01T14:43:17.000Z

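At its current level the port isn't very difficult. I hadn't touched programming since graduating, and I only started with Python about two months ago, so you can imagine my skill level.
Nobody ported it because the audience is small: it doesn't attract much attention, so it's a thankless effort, and most people lean toward image-related projects.
I did this on a whim, with almost no background or experience, and could only solve the problems slowly by reading the original project's code and the code of derived projects, using search engines, studying various plugins and ComfyUI's native code, and trial and error.
The core code all comes from the original AnyText project. Right now this plugin just wraps it in a ComfyUI interface shell and is not natively integrated.
A more native integration would be to add a model_patch node and then use some of ComfyUI's native nodes (sampler, mask, image input, and so on), which would probably involve lower-level work.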
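For readers unfamiliar with what a thin ComfyUI wrapper node looks like, here is a rough sketch following the usual custom-node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`, `NODE_CLASS_MAPPINGS`). The class `ExampleTextNode` and its inputs are hypothetical and are not taken from ComfyUI-AnyText; the last lines imitate the `getattr(obj, func)(...)` call visible in the traceback above.

```python
class ExampleTextNode:
    """Hypothetical wrapper node; all names are illustrative, not from ComfyUI-AnyText."""

    @classmethod
    def INPUT_TYPES(cls):
        # Declares the inputs the node exposes in the UI.
        return {
            "required": {
                "prompt": ("STRING", {"default": "", "multiline": True}),
                "device": (["cuda", "cpu"],),
            }
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "process"      # name of the method the executor looks up
    CATEGORY = "examples"

    def process(self, prompt, device):
        # A real wrapper would call the underlying pipeline here.
        return (f"[{device}] {prompt}",)


# Registry that a custom-node package exposes to ComfyUI.
NODE_CLASS_MAPPINGS = {"ExampleTextNode": ExampleTextNode}

# Dispatch by method name, similar to what the executor does.
node = ExampleTextNode()
result = getattr(node, ExampleTextNode.FUNCTION)(prompt="hello", device="cpu")
print(result)
```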

Update: The code has been updated. Thanks for your support.

Answer 7 · 2024-07-02T03:21:29.000Z

Good to hear. I made a workflow that can generate a batch in various styles with one click:
https://www.bilibili.com/video/BV1Gz421z7E7/