
Int8 StableFusion model


Experiments on testing GaintModels such as GPT3 and StableFusion. We offer TensorRT and Int8 quantization for these giant models, so you can run inference on a GPU with less than 6 GB of memory.


Some requirements to install:

pip install diffusers
pip install transformers
pip install alfred-py


  1. StableFusion:


Currently the best way to accelerate StableFusion is to run the UNet with TensorRT and keep the other components in PyTorch (their runtime is not critical). To export the UNet to ONNX, run python export_unet.py.

This produces the UNet ONNX file. Convert it to an fp16 TensorRT engine with trtexec:

trtexec --onnx=unet_v1_4_fp16_pytorch_sim.onnx --fp16 --saveEngine=unet_fp16.trt
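
The conversion command above can also be assembled from Python, which helps when the file names change between runs. This is a minimal sketch: the helper name build_trtexec_cmd is hypothetical and not part of this repo, it only uses the three flags shown above, and actually running the command assumes trtexec is on your PATH.

```python
import shlex


def build_trtexec_cmd(onnx_path, engine_path, fp16=True):
    """Assemble the trtexec argument list (hypothetical helper, not part of this repo)."""
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if fp16:
        # Same precision flag as in the command line shown in this README.
        cmd.append("--fp16")
    cmd.append(f"--saveEngine={engine_path}")
    return cmd


cmd = build_trtexec_cmd("unet_v1_4_fp16_pytorch_sim.onnx", "unet_fp16.trt")
print(shlex.join(cmd))

# To actually build the engine (requires TensorRT's trtexec on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.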

Then run the demo with the TensorRT UNet:

python demo_part.py --trt

First, download the StableFusion weights from Hugging Face.

git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
git lfs install
cd stable-diffusion-v1-4
git lfs pull

Download the weights with Git LFS (Large File Storage); the model is about 3 GB.
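
If git lfs pull was skipped or failed, the weight files in the clone are small Git LFS pointer files rather than the real weights. The sketch below is one way to detect that before exporting. The helper names are hypothetical; the check relies only on the fact that an LFS pointer file begins with the line version https://git-lfs.github.com/spec/v1.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like an un-pulled Git LFS pointer."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unpulled_files(repo_dir):
    """List files under repo_dir that are still LFS pointers (hypothetical helper)."""
    root = Path(repo_dir)
    return sorted(
        str(p)
        for p in root.rglob("*")
        # Skip Git's own metadata directory.
        if p.is_file()
        and ".git" not in p.relative_to(root).parts
        and is_lfs_pointer(p)
    )


# Example usage after cloning:
# missing = find_unpulled_files("stable-diffusion-v1-4")
# if missing:
#     print("Run `git lfs pull`; these files are still pointers:", missing)
```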

To make unet_2d_condition in StableFusion exportable to ONNX, make the following modifications to diffusers: link

file: diffusers/models/unet_2d_condition.py

# L137
- timesteps = timesteps.broadcast_to(sample.shape[0])
+ #timesteps = timesteps.broadcast_to(sample.shape[0])
+ timesteps = timesteps * torch.ones(sample.shape[0])

- output = {"sample": sample}
+ #output = {"sample": sample}

- return output
+ return sample
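
To see what the timesteps change does, the sketch below compares the two forms on a toy input. It assumes PyTorch is installed; the timestep value and batch size are made up for illustration. One side effect worth knowing: multiplying by torch.ones promotes an integer timestep to floating point, which broadcast_to does not.

```python
import torch

batch_size = 2                  # stands in for sample.shape[0]
timesteps = torch.tensor([10])  # a single integer timestep, shape (1,)

# Original form: a broadcast view of the same value.
expanded_view = timesteps.broadcast_to((batch_size,))

# Replacement form: explicit multiplication by a vector of ones.
expanded_mul = timesteps * torch.ones(batch_size)

print(expanded_view.shape, expanded_mul.shape)
print(expanded_view.dtype, expanded_mul.dtype)

# Same values once the dtypes are aligned.
print(torch.equal(expanded_view.to(expanded_mul.dtype), expanded_mul))
```

Both forms produce one timestep per batch element; they differ only in dtype and in which operation ends up in the traced graph.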

After that, move stable-diffusion-v1-4 into the weights folder, then run the following to generate the ONNX models:

python export.py