Loading a compiled pipe seems to trigger a recompilation
w1ndseeker opened this issue · 2 comments
w1ndseeker commented
Describe the bug
When I test the onediffx diffusers save_and_load example, some warnings appear:
Input structure key None to b47b96 has changed. Resetting the deployable module graph. This may slow down the process.
...
and the time cost is nearly equal to onediff online compilation: 66 s (load_pipe) compared to 63 s (online compile).
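For context, the 63-66 s above are end-to-end times that also include loading the model weights. Below is a minimal sketch of how one might time only the first pipeline call, which is where (re)compilation or a graph reset would show up; time_first_call is a hypothetical helper, not part of the example script.
import time

def time_first_call(pipe, **call_kwargs):
    # Times only the first pipeline call, where compilation (or a graph
    # reset) would show up; model loading and compile_pipe/load_pipe
    # setup are excluded from the measurement.
    start = time.perf_counter()
    images = pipe(**call_kwargs).images
    return images, time.perf_counter() - start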
Your environment
print(torch.utils.collect_env.get_pretty_env_info())
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: 14.0.0-1ubuntu1.1
CMake version: version 3.26.4
Libc version: glibc-2.35
Python version: 3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-6.5.0-41-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 12.5.82
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4090
Nvidia driver version: 555.42.06
OneDiff git commit id
onediff 1.2.0.dev202406280130
onediffx 1.2.0.dev1
OneFlow version info if you have installed oneflow
path: ['/home/ws/miniconda3/envs/of/lib/python3.10/site-packages/oneflow']
version: 0.9.1.dev20240627+cu121
git_commit: ec7b682
cmake_build_type: Release
rdma: True
mlir: True
enterprise: False
How To Reproduce
The code
# Command to run save: python test_pipe_compile_save_load.py --save
# Command to run load: python test_pipe_compile_save_load.py --load
import argparse
import torch
from diffusers import StableDiffusionXLPipeline
from onediffx import compile_pipe, save_pipe, load_pipe
parser = argparse.ArgumentParser()
parser.add_argument(
    "--model", type=str, default="stabilityai/stable-diffusion-xl-base-1.0"
)
parser.add_argument("--save", action=argparse.BooleanOptionalAction)
parser.add_argument("--load", action=argparse.BooleanOptionalAction)
cmd_args = parser.parse_args()
import time

start = time.perf_counter()

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
)
pipe.to("cuda")
# compile the pipe
pipe = compile_pipe(pipe)
if cmd_args.load:
    # load the compiled pipe
    load_pipe(pipe, dir="cached_pipe")

# If the pipe is not loaded, the first call will take seconds to do the real compilation.
# If the pipe is loaded, it should run fast.
image = pipe(
    prompt="street style, detailed, raw photo, woman, face, shot on CineStill 800T",
    height=512,
    width=512,
    num_inference_steps=30,
    output_type="pil",
).images

print(f"Time taken: {time.perf_counter() - start} seconds")
image[0].save("test_image.png")

if cmd_args.save:
    # save the compiled pipe
    save_pipe(pipe, dir="cached_pipe")
logs:
python save_and_load.py --save
/home/ws/miniconda3/envs/of/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 9.88it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:37<00:00, 1.24s/it]
Time taken: 63.03123291092925 seconds
python save_and_load.py --load
/home/ws/miniconda3/envs/of/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 9.73it/s]
WARNING [2024-07-19 15:49:26] /home/ws/miniconda3/envs/of/lib/python3.10/site-packages/onediff/infer_compiler/backends/oneflow/args_tree_util.py:59 - Input structure key None to b47b96 has changed. Resetting the deployable module graph. This may slow down the process.
WARNING [2024-07-19 15:49:29] /home/ws/miniconda3/envs/of/lib/python3.10/site-packages/onediff/infer_compiler/backends/oneflow/args_tree_util.py:59 - Input structure key None to b47b96 has changed. Resetting the deployable module graph. This may slow down the process.
0%| | 0/30 [00:00<?, ?it/s]WARNING [2024-07-19 15:49:38] /home/ws/miniconda3/envs/of/lib/python3.10/site-packages/onediff/infer_compiler/backends/oneflow/args_tree_util.py:59 - Input structure key None to 32274a has changed. Resetting the deployable module graph. This may slow down the process.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:36<00:00, 1.21s/it]
WARNING [2024-07-19 15:50:14] /home/ws/miniconda3/envs/of/lib/python3.10/site-packages/onediff/infer_compiler/backends/oneflow/args_tree_util.py:59 - Input structure key None to c4612e has changed. Resetting the deployable module graph. This may slow down the process.
Time taken: 66.22534257895313 seconds
ccssu commented
Please update onediff to resolve the issue, @w1ndseeker; it has already been fixed in PR #1005.
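If onediff/onediffx were installed as PyPI pre-releases (as the dev version strings above suggest), updating with pip should pick up the fix, e.g. pip install --upgrade --pre onediff onediffx; if they were installed from a source checkout of the onediff repo, pulling the latest code and reinstalling should work as well. (This is a general suggestion, not a command confirmed in this thread.)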
w1ndseeker commented
> Please update onediff to resolve the issue, @w1ndseeker; it has already been fixed in PR #1005.
Thanks for the fix. After updating, it works properly 😊