FP8 support in stable-fast
jkrauss82 opened this issue · 6 comments
Is it planned?
Currently getting this error when trying to run ComfyUI in fp8 (flags --fp8_e4m3fn-text-enc --fp8_e4m3fn-unet):
RuntimeError: "addmm_cuda" not implemented for 'Float8_e4m3fn'
I'm quite sure stable-fast has its own quantization support, but it isn't wired into the ComfyUI node, IIRC.
@jkrauss82 Sorry, FP8 kernels aren't implemented and I guess I lack the time to support them now.
Thanks for the reply, understood. It would be nice if it could be supported eventually.
@jkrauss82 I have created a new project which supports FP8 inference with diffusers. However, it has not been open-sourced yet. I hope it can be made public soon...
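For anyone curious what real FP8 inference involves (this is not stable-fast or the new project's code, just an illustrative sketch): PyTorch exposes actual FP8 GEMMs through the private `torch._scaled_mm` API, which requires an Ada/Hopper-class GPU and whose signature has changed across releases, so treat the call below as indicative only.

```python
import torch

# Sketch of a scaled FP8 matmul via the private torch._scaled_mm API.
M, K, N = 64, 128, 32  # dims should be multiples of 16
a = torch.randn(M, K, device="cuda").to(torch.float8_e4m3fn)      # row-major
b = torch.randn(N, K, device="cuda").to(torch.float8_e4m3fn).t()  # column-major
scale_a = torch.tensor(1.0, device="cuda")  # per-tensor scales; real code
scale_b = torch.tensor(1.0, device="cuda")  # derives these from amax stats
out = torch._scaled_mm(a, b, scale_a=scale_a, scale_b=scale_b,
                       out_dtype=torch.float16)
# Note: some PyTorch versions return (out, amax) instead of a single tensor.
```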
A new project could be published soon to support FP8 inference instead of stable-fast. I hope everyone will enjoy it.
That would be very welcome. I have seen fp8 support gaining traction recently in the vLLM project. It would be nice to have it in diffusers/image generation as well. I will stay tuned. Thanks for the update!