On an Intel Mac with a discrete GPU, generating on the GPU outputs some kind of random pattern
bdev36 opened this issue · 5 comments
Running on an Intel iMac (2020) with a discrete Radeon 5700 (8GB), the result is always something like the attached screenshot.
I've cloned the repository:
- Diffusion-macOS: the problem is identical. The GPU is doing the work, but the result is random pixels.
- Diffusion-macOS using `ComputeUnits.cpuOnly` (two modifications to `ControlsView.swift`): the CPU is (slowly) doing the work and the result is OK (see the sketch below).
- Diffusion: the CPU is doing the work and the result is OK.
In all cases, no error or exception is raised.
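For reference, the `cpuOnly` workaround boils down to handing Core ML a configuration whose compute units are restricted to the CPU. Below is a minimal sketch using the standard `MLModelConfiguration` API; the resources path is a placeholder and the actual `ControlsView.swift` changes are more involved:

```swift
import Foundation
import CoreML
import StableDiffusion

// Sketch only: restrict inference to the CPU so the discrete GPU is never used.
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly   // .cpuAndGPU / .all is what produces the garbled output here

// Placeholder path to the compiled Core ML resources bundled with the app.
let resourcesURL = URL(fileURLWithPath: "/path/to/Resources")

// Inside a throwing context; the pipeline then runs entirely on the CPU.
let pipeline = try StableDiffusionPipeline(resourcesAt: resourcesURL,
                                           configuration: config)
```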
The console output is very similar in all cases:
Generating...
Got images: [Optional(<CGImage 0x7f813e3b19c0> (IP)
<<CGColorSpace 0x60000192dda0> (kCGColorSpaceDeviceRGB)>
width = 512, height = 512, bpc = 8, bpp = 24, row bytes = 1536
kCGImageAlphaNone | 0 (default byte order) | kCGImagePixelFormatPacked
is mask? No, has masking color? No, has soft mask? No, has matte? No, should interpolate? Yes)] in 17.003490924835205
Diffusion also outputs this for each step:
2023-04-22 17:30:02.375841+0200 Diffusion[7894:267125] [API] cannot add handler to 3 from 3 - dropping
The output image is the same as above.
I'm testing stable-diffusion-webui: if the `--no-half` option is set it generates a correct image; otherwise it outputs a black image. I think the problem is related to model precision.
I found a solution. You have to manually convert the model and set the precision to FP32.
Here is an example, based on the conversion script in https://github.com/apple/ml-stable-diffusion:
coreml_model = ct.convert(
    torchscript_module,
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.macOS13,
    inputs=_get_coreml_inputs(sample_inputs, args),
    outputs=[ct.TensorType(name=name) for name in output_names],
    compute_units=ct.ComputeUnit[args.compute_unit],
    compute_precision=ct.precision.FLOAT32,
    # skip_model_load=True,
)
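(For context: without this change the script produces FP16 weights, which is coremltools' default for `mlprogram` models; that half-precision path seems to be what these discrete AMD GPUs can't handle, consistent with the `--no-half` observation above. Expect the FP32 models to be roughly twice the size on disk.)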
Nice finding, thanks!
@pcuenca To answer your question: I was debugging the app under Xcode, with default settings, the labrador prompt, and timing the second generation only.
The performance ratio is roughly 25:1:
- Built-in 2.1 model on CPU (8-core i7 @ 3.8 GHz): 446 s
- The same model converted manually to FLOAT32, on GPU (Radeon Pro 5700, 8 GB) + CPU: 17 s
Thanks again to @Kila2 and you. It's now working perfectly on the GPU.
