How to run CogVideoX1.5-5B-I2V as int8?
I am running it as shown below, but it still uses 22 GB of VRAM and is very slow on an RTX 3090.
What am I doing wrong?
import torch
from diffusers import AutoencoderKLCogVideoX, CogVideoXTransformer3DModel, CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image
from transformers import T5EncoderModel
from torchao.quantization import quantize_, int8_weight_only

quantization = int8_weight_only

# Load each component in bf16 and quantize its weights to int8 in place
text_encoder = T5EncoderModel.from_pretrained(
    "THUDM/CogVideoX1.5-5B-I2V", subfolder="text_encoder", torch_dtype=torch.bfloat16
)
quantize_(text_encoder, quantization())

transformer = CogVideoXTransformer3DModel.from_pretrained(
    "THUDM/CogVideoX1.5-5B-I2V", subfolder="transformer", torch_dtype=torch.bfloat16
)
quantize_(transformer, quantization())

vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX1.5-5B-I2V", subfolder="vae", torch_dtype=torch.bfloat16
)
quantize_(vae, quantization())

# Create pipeline and run inference
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX1.5-5B-I2V",
    text_encoder=text_encoder,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16,
)

# Offload idle components to CPU and tile/slice the VAE to reduce VRAM usage
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

prompt = "a fast car"
image = load_image(image="input.png")

video = pipe(
    prompt=prompt,
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=24,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

export_to_video(video, "output.mp4", fps=8)
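For what it's worth, a quick way to check whether the quantized and offloaded run actually peaks at lower VRAM is to read the CUDA peak-memory counters around the pipeline call. A minimal sketch (it reuses the pipe, prompt, and image objects defined above; the print label is just my own):

torch.cuda.reset_peak_memory_stats()

video = pipe(prompt=prompt, image=image, num_frames=24).frames[0]

# Peak VRAM allocated by PyTorch during the run, in GB
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GB")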
Running it this way made it use 8 GB. Do I set the video FPS in the pipe call?
prompt = "a fast car"
image = load_image(image="input.png")
video = pipe(
prompt=prompt,
height=480,
width=720,
image=image,
num_videos_per_prompt=1,
num_inference_steps=50,
num_frames=12,
guidance_scale=6,
generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]
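Regarding the FPS part of my question: as far as I can tell, the pipeline call only controls how many frames are generated (num_frames); the playback frame rate is set when the frames are written out, e.g.:

# fps here only affects the encoded file's playback speed,
# not how many frames the pipeline generates
export_to_video(video, "output.mp4", fps=8)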
Yes, setting num_frames is necessary; otherwise the default is 49 frames (8 * 6 + 1), which is not the right value for CogVideoX1.5-5B. Please adjust each parameter according to cli_demo and rerun. Thank you.
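For reference, a hedged sketch of what the call might look like with CogVideoX1.5-style values; num_frames=81 and fps=16 are my reading of the 1.5 defaults and should be verified against cli_demo in the repository:

# Values below are assumptions based on my reading of cli_demo for CogVideoX1.5;
# double-check them before relying on this.
video = pipe(
    prompt=prompt,
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=81,  # 49 (8 * 6 + 1) is the older CogVideoX default, not for 1.5
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

export_to_video(video, "output.mp4", fps=16)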
@zRzRzRzRzRzRzR Amazing work! I got it running on Windows with much higher speed and low VRAM.
However, how to prompt it is still a mystery to me. Can you guide me?
Here is an example. I used this prompt:
i used this prompt : A highly detailed, majestic dragon, with shimmering orange and white scales, slowly turns its head to gaze intently with a piercing golden eye, as glowing embers drift softly in the air around it, creating a magical, slightly mysterious atmosphere in a blurred forest background.