ExponentialML/Text-To-Video-Finetuning

model inference of version2

suzhenghang opened this issue · 7 comments

After fine-tuning with version 2 completes, how do I run inference with the model? With version 1 it looks like this:

import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
from diffusers.utils import export_to_video

my_trained_model_path = "./trained_model_path/"
pipe = DiffusionPipeline.from_pretrained(my_trained_model_path, torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

prompt = "Your prompt based on train data"
video_frames = pipe(prompt, num_inference_steps=25).frames

out_file = "./my_video.mp4"
video_path = export_to_video(video_frames, out_file)

I'm using the same approach as in version 1, and it seems to work correctly.

mark

Did you manage to fine-tune it correctly? For me, fine-tuning makes no difference: the model comes out identical to how it was before.
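One quick way to test the "identical model" symptom is to compare the saved weight files of the base model and the fine-tuned output. The sketch below is only an illustration, not part of this repository: the function names and the example paths are hypothetical, and it assumes both directories use the same sub-folder layout and file format. Matching hashes strongly suggest the weights were never updated or never saved; differing hashes only show that something changed, not that training worked.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged_weight_files(base_dir, tuned_dir, suffixes=(".bin", ".safetensors")):
    """List weight files that are byte-identical in both directories.

    Paths are returned relative to base_dir, using forward slashes.
    Files that exist in only one of the two directories are ignored.
    """
    base_dir, tuned_dir = Path(base_dir), Path(tuned_dir)
    unchanged = []
    for base_file in sorted(base_dir.rglob("*")):
        if not base_file.is_file() or base_file.suffix not in suffixes:
            continue
        relative = base_file.relative_to(base_dir)
        tuned_file = tuned_dir / relative
        if tuned_file.is_file() and file_sha256(base_file) == file_sha256(tuned_file):
            unchanged.append(relative.as_posix())
    return unchanged


# Hypothetical usage (both paths are placeholders):
# print(unchanged_weight_files("./base_model/", "./trained_model_path/"))
```

If the UNet weight file shows up in the returned list, the fine-tuned weights most likely did not reach the saved checkpoint.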

I used Accelerate for distributed training, which may behave differently when saving checkpoints; I need to check that. Training and validation both run fine. However, when I use the version 1 inference code, I get an error while initializing the models, as follows:
log.txt
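The attached log is not reproduced here, so the exact error is unknown. One thing worth ruling out when `DiffusionPipeline.from_pretrained` fails to initialize models is an incomplete output directory: diffusers expects a `model_index.json` at the root, plus one sub-folder per component listed in it. The sketch below is a hypothetical helper (not part of diffusers or this repository) that reports which listed components have no folder on disk.

```python
import json
from pathlib import Path


def missing_pipeline_parts(model_dir):
    """Return the component names from model_index.json that have no sub-folder.

    Returns ["model_index.json"] if the index file itself is absent.
    """
    model_dir = Path(model_dir)
    index_file = model_dir / "model_index.json"
    if not index_file.is_file():
        return ["model_index.json"]

    index = json.loads(index_file.read_text(encoding="utf-8"))
    missing = []
    for name, value in index.items():
        # Keys starting with "_" are metadata such as "_class_name".
        if name.startswith("_"):
            continue
        # Components are stored as [library, class_name]; entries that are
        # not such a pair, or that contain null, are skipped.
        if not (isinstance(value, list) and len(value) == 2 and all(value)):
            continue
        if not (model_dir / name).is_dir():
            missing.append(name)
    return missing


# Hypothetical usage (the path is a placeholder):
# print(missing_pipeline_parts("./trained_model_path/"))
```

An empty list means the layout looks complete, which would point toward the contents of the saved weights rather than the directory structure.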

@ExponentialML any suggestions?

Hey all, apologies for the training nuances. I've been following this thread and found that there was a bug in the way checkpoints were being saved. The latest release should fix these issues, and it also adds full LoRA training.

@ExponentialML Nice, inference code is running normally now.