aredden/flux-fp8-api
Flux diffusion model implementation using quantized fp8 matmul; the remaining layers use faster half-precision accumulation, making it ~2x faster on consumer devices.
Python · Apache-2.0
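The description above refers to the usual fp8 recipe: scale each tensor into the fp8 dynamic range, multiply, then rescale the result. On supported GPUs this is done with fused kernels (e.g. `torch._scaled_mm`); the snippet below is only a dependency-free conceptual sketch of that scale-quantize-matmul-rescale idea, not the repo's actual code. `E4M3_MAX = 448.0` is the largest finite value of the float8 e4m3 format; the function names are invented for illustration.

```python
# Conceptual sketch of fp8-style scaled matmul (NOT the repo's implementation).
E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3


def quantize_scale(mat):
    """Per-tensor dynamic scale so the max |value| maps to the fp8 range.

    A real fp8 path would also round each scaled entry to the nearest
    representable e4m3 value; that rounding step is omitted here.
    """
    amax = max(abs(v) for row in mat for v in row) or 1.0
    scale = amax / E4M3_MAX
    q = [[v / scale for v in row] for row in mat]
    return q, scale


def scaled_matmul(a, b):
    """Quantize both operands, multiply, then undo both scales."""
    qa, sa = quantize_scale(a)
    qb, sb = quantize_scale(b)
    rows, inner, cols = len(qa), len(qb), len(qb[0])
    return [[sum(qa[i][k] * qb[k][j] for k in range(inner)) * sa * sb
             for j in range(cols)] for i in range(rows)]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(scaled_matmul(a, b))  # close to the exact product [[19, 22], [43, 50]]
```

Because the sketch skips the actual e4m3 rounding, the scales cancel exactly; with real fp8 storage each entry would carry a small rounding error, which is why per-tensor amax scaling matters for accuracy.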
Issues
When loading certain LoRAs: AttributeError: 'Flux' object has no attribute 'diffusion_model'
#34 opened by fyepi - 1
Certain lora not applied correctly.
#36 opened by fyepi - 1
Acceleration not as expected
#35 opened by alecyan1993 - 1
Where is the code about "remaining layers use faster half precision accumulate"?
#10 opened by goldhuang - 2
Issue: torch._scaled_mm RuntimeError on RTX 6000 (with runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04)
#30 opened by veyorokon - 2
A question regarding whether the LoRA has been successfully applied during inference
#29 opened by zhangqi420 - 2
Any plans for controlnet + inpainting support?
#15 opened by 0xtempest - 7
Potential LoRA performance issue
#9 opened by ashakoen - 17
Hot Lora Replacement
#18 opened by Lantianyou - 2
[bug]UnboundLocalError: cannot access local variable 'temp_77_token_ids' where it is not associated with a value
#23 opened by 81549361 - 4
TypeError: NoneType takes no arguments
#25 opened by lvjin521 - 3
Load a LORA using the API
#20 opened by acaladolopes - 4
The speed of drawing is not satisfactory
#26 opened by lvjin521 - 4
Why is vae decoder so slow? Can you help me?
#27 opened by radish0926 - 21
How to save a "prequantized_flow" safetensor?
#16 opened by smuelpeng - 0
PuLID support
#19 opened by 81549361 - 5
Docker image support.
#17 opened by ShivamB25 - 3
No issue - just a thank you!
#4 opened by ashakoen - 4
Consider adding a license to the code
#12 opened by flowpoint - 2
Error No module named 'cublas_ops'
#5 opened by ankitsiliconithub - 2