aws-neuron/aws-neuron-sdk

Bad image quality for Stable Diffusion 1.5 after applying the optimized attention score

JingyaHuang opened this issue · 4 comments

Hi team,

Optimum Neuron received an issue reporting that images generated by Stable Diffusion 1.5 have bad quality compared to the CPU ones: huggingface/optimum-neuron#607

After investigation, the issue seems to be related to the optimized attention score suggested in the official AWS Neuron sample. We find it hard to understand why it causes the quality drop, since it should theoretically be equivalent to the original computation, and the issue doesn't appear on other SD models (e.g. SD2, SDXL).
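For context, here is a minimal sketch of the equivalence we expected from the reordered score computation (our own illustration with toy shapes, not the exact sample code):

```python
import torch

def reference_scores(q, k, scale):
    # Standard attention probabilities: softmax over the key dim of (Q K^T) * scale.
    return torch.softmax(torch.bmm(q, k.transpose(-1, -2)) * scale, dim=-1)

def reordered_scores(q, k, scale):
    # Reordered variant in the spirit of the sample's optimization (a sketch, not
    # the sample code): compute (K Q^T) instead, softmax along dim=1 (still the
    # key dim), then permute back. Algebraically identical to the reference.
    return torch.softmax(torch.bmm(k, q.transpose(-1, -2)) * scale, dim=1).permute(0, 2, 1)

# Toy shapes, loosely modeled on a single attention head (illustrative only).
q = torch.randn(2, 64, 40)
k = torch.randn(2, 64, 40)
scale = 40 ** -0.5

# Expected to print True in fp32 on CPU.
print(torch.allclose(reference_scores(q, k, scale), reordered_scores(q, k, scale), atol=1e-6))
```

On CPU in fp32 the two agree to within rounding, so any divergence on device would have to come from reduced precision or reduction order rather than from the math itself.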

Could the team share more insight on this? Thanks!

Thanks @JingyaHuang,

I've reached out to the relevant engineers. We'll get someone looking at this and update accordingly.

Hi @JingyaHuang, I was able to reproduce the images on CPU fp32. I also tested on a future release version of Neuron, and the images match the CPU ones. Could you share which Neuron version you were using (the neuronx-cc version specifically would be helpful), so I can test that as well?

Also, for your Neuron run, was that with bf16 or fp32? It's determined mainly by the dtypes of the example input tensors and model weights at compile time.
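As an illustration, here is a minimal sketch (with a toy module standing in for the UNet, not our actual sample code) of how the traced dtypes drive the compiled precision:

```python
import torch
import torch_neuronx

# Toy stand-in for the UNet: the dtype of the weights and of the example input
# at trace time determines the precision the compiled graph runs in (subject
# to the compiler's auto-cast settings).
class TinyBlock(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(320, 320)

    def forward(self, x):
        return self.proj(x)

# fp32 weights + fp32 example input -> an fp32 trace.
neuron_fp32 = torch_neuronx.trace(TinyBlock().eval(), torch.randn(1, 77, 320))

# bf16 weights + bf16 example input -> a bf16 trace.
neuron_bf16 = torch_neuronx.trace(TinyBlock().bfloat16().eval(), torch.randn(1, 77, 320).bfloat16())
```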

Hi @aws-bhegedus,

Here is the setup in which I met the image quality issue:

  • Neuron SDK
aws-neuronx-collectives/unknown,now 2.20.22.0-c101c322e amd64 [installed]
aws-neuronx-dkms/unknown,now 2.16.7.0 amd64 [installed]
aws-neuronx-runtime-lib/unknown,now 2.20.22.0-1b3ca6425 amd64 [installed]
aws-neuronx-tools/unknown,now 2.17.1.0 amd64 [installed]
  • pip
aws-neuronx-runtime-discovery 2.9
diffusers                     0.28.2
libneuronxla                  2.0.965
neuronx-cc                    2.13.66.0+6dfecc895
neuronx-distributed           0.7.0
optimum                       1.20.0
optimum-neuron                0.0.24.dev0
peft                          0.10.0
sentence-transformers         2.6.1
torch                         2.1.2
torch-neuronx                 2.1.2.2.1.0
torch-xla                     2.1.2
torchvision                   0.16.2
transformers                  4.41.1
transformers-neuronx          0.10.0.21

And here is a branch in optimum-neuron that reproduces the issue we observed:
https://github.com/huggingface/optimum-neuron/tree/restore-optimized-attn-score-sd15

The example input tensors are in float32 / int64, and the weights are downcast to bf16 for matmul.
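In case it helps narrow things down, here is roughly how we would force the matmuls to stay in fp32 to check whether the bf16 downcast is responsible (a sketch with a toy module; the auto-cast flag names should be double-checked against the neuronx-cc reference for this SDK version):

```python
import torch
import torch_neuronx

# Minimal stand-in module; passing "--auto-cast none" asks the compiler to keep
# fp32 matmuls in fp32 instead of downcasting them to bf16, which lets us rule
# the bf16 matmul path in or out as the cause of the quality drop.
toy = torch.nn.Linear(320, 320).eval()
neuron_full_fp32 = torch_neuronx.trace(
    toy,
    torch.randn(1, 77, 320),
    compiler_args=["--auto-cast", "none"],
)
```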

Thanks, team, for helping us find the root cause of this. A fix will be put in place by this PR: huggingface/optimum-neuron#646.

Thanks again for the help!