chengzeyi/stable-fast

FP8 support in stable fast

jkrauss82 opened this issue · 6 comments

Is it planned?

Currently getting this error when trying to run ComfyUI in fp8 (flags --fp8_e4m3fn-text-enc --fp8_e4m3fn-unet):

RuntimeError: "addmm_cuda" not implemented for 'Float8_e4m3fn'


I'm quite sure stable-fast has its own quantization support, but IIRC it's not implemented in the ComfyUI node.
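For comparison, the quantization that stable-fast-style pipelines typically expose is PyTorch's built-in dynamic int8 quantization of linear layers, which is a separate mechanism from fp8 storage. A minimal sketch using only stock PyTorch APIs (the toy model is illustrative):

```python
import torch
import torch.nn as nn

# Toy model standing in for a text encoder / UNet submodule.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU())

# Dynamically quantize Linear weights to int8; activations stay float
# and weights are dequantized on the fly inside the kernel.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

out = qmodel(torch.randn(2, 8))
print(out.shape)
```

Whether that path is actually wired up in the ComfyUI node is a separate question from FP8 kernel support.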

@jkrauss82 Sorry, FP8 kernels aren't implemented and I guess I lack the time to support them now.

Thanks for the reply, understood. It would be nice if it could be supported eventually.

@jkrauss82 I have created a new project which supports FP8 inference with diffusers. However, it has not been open-sourced yet. I hope it can be made public soon...


A new project supporting FP8 inference may be published soon as a successor to stable-fast. I hope everyone will enjoy it.

That would be very welcome. I have seen FP8 support gaining traction recently in the vLLM project; it would be nice to have it in diffusers/image generation as well. I will stay tuned. Thanks for the update!