"Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions."

Question

Closed this issue 2 months ago · 0 comments

Describe the bug
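Image generation aborts on the first sampling step (1/60) with `RuntimeError: CUDA error: an illegal memory access was encountered`. The full terminal output is included under "Terminal output".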

To Reproduce
Steps to reproduce the behavior:

1. Install Easy Diffusion.
2. Enter a prompt.
3. Wait for image generation to start.
4. Generation fails with the CUDA error shown in the terminal output.

Expected behavior
I expected an image to be generated as usual, as I have been able to do for months. In recent months, however, some update seems to have broken memory usage. It might be a fault in torch or in NVIDIA's drivers; I am unsure. I have updated my drivers and am running the latest Windows version.

Desktop (please complete the following information):

OS: Microsoft Windows 11 Pro NT Version 23H2 (OS Build 22631.3447)
Browser: Mozilla Firefox
Version: v3
GPU: RTX 4060 8GB

Additional context
This seems to be a fairly recent issue. As stated above, I had no such problems when I built my computer a few months ago, but at some point this year, a few months back, generation started breaking down. The issue appears to be related to Stable Diffusion in general rather than to Easy Diffusion specifically: I also tried installing AUTOMATIC1111's web UI, and it breaks down at a similar point.
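Since the failure shows up in two different front ends, it may help to compare the environment of a working and a failing install. The following is a minimal sketch, not part of Easy Diffusion: `collect_env_report` is a hypothetical helper name, and it assumes it is run with the same Python interpreter the application uses. If torch cannot be imported, it reports only the Python and platform details.

```python
import platform
import sys


def collect_env_report():
    """Gather version details worth comparing between a working and a failing install."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "torch": None,
    }
    try:
        import torch  # imported lazily so the script still runs without torch
    except ImportError:
        return report

    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # CUDA version torch was built against
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
        free_bytes, total_bytes = torch.cuda.mem_get_info()
        report["vram_free_gib"] = round(free_bytes / 2**30, 2)
        report["vram_total_gib"] = round(total_bytes / 2**30, 2)
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Attaching this output for both installs would show whether they share the same torch build and driver-visible VRAM, which is one way to narrow down whether the regression comes from the Python stack or from the driver.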

Terminal output
`Install dir: C:\EasyDiffusion
C:\EasyDiffusion\installer_files\env\Library\bin\git.exe
C:\Program Files\Git\cmd\git.exe
git version 2.42.0.windows.1
C:\EasyDiffusion\installer_files\env\Library\bin\conda.bat
C:\EasyDiffusion\installer_files\env\Scripts\conda.exe
conda 23.7.3
.
COMSPEC=C:\Windows\system32\cmd.exe
AdapterRAM DriverDate DriverVersion Name
4293918720 20240411000000.000000-000 31.0.15.5222 NVIDIA GeForce RTX 4060

"Easy Diffusion - v3"

"Easy Diffusion's git repository was already installed. Updating from main.."
No local changes to save
HEAD is now at dfb26ed Merge pull request #1702 from easydiffusion/beta
Already on 'main'
Your branch is up to date with 'origin/main'.
Already up to date.
519 File(s) copied
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
A subdirectory or file tmp already exists.
PYTHONPATH=C:\EasyDiffusion\installer_files\env\lib\site-packages
C:\EasyDiffusion\installer_files\env\python.exe
C:\Users\Username\AppData\Local\Programs\Python\Python310\python.exe
C:\Users\Username\AppData\Local\Programs\Python\Python312\python.exe
C:\Users\Username\AppData\Local\Microsoft\WindowsApps\python.exe
Python 3.8.5
torch: 2.0.1+cu117
torchvision: 0.15.2+cu117
sdkit: 2.0.15
stable-diffusion-sdkit: 2.1.5
{'model': {'stable-diffusion': 'absolutereality_v181',
'vae': 'vae-ft-mse-840000-ema-pruned'},
'models_dir': 'C:\EasyDiffusion\models',
'net': {'listen_port': 9000, 'listen_to_network': False},
'render_devices': 'auto',
'ui': {'open_browser_on_start': True},
'update_branch': 'main',
'use_v3_engine': True,
'vram_usage_level': 'balanced'}

Easy Diffusion installation complete, starting the server!

PYTHONPATH=C:\EasyDiffusion\installer_files\env\lib\site-packages
Python: C:\EasyDiffusion\installer_files\env\python.EXE
Version: 3.8.5
Checking network settings
Set listen port to 9000

Launching uvicorn

07:33:21.777 INFO MainThread started in C:\EasyDiffusion\stable-diffusion server.py:31
07:33:21.781 INFO MainThread started at 04/28/24 07:33:21 server.py:32
stable-diffusion model(s) found.
gfpgan model(s) found.
realesrgan model(s) found.
vae model(s) found.
07:33:22.183 INFO MainThread Start new Rendering Thread on device: cuda:0 task_manager.py:408
07:33:22.191 INFO cuda:0 Device usage during initialization: runtime.py:35
07:33:22.196 INFO cuda:0 CPU utilization: 3.8%, System RAM used: 10.6 of 31.9 GiB, GPU RAM used memory_utils.py:47
(cuda:0): 1.1 of 8.0 GiB (peak: 0.0 GiB)
07:33:22.198 INFO cuda:0 Setting cuda:0 as active, with precision: half device_manager.py:154
07:33:22.919 INFO cuda:0 loading stable-diffusion model from init.py:52
C:\EasyDiffusion\models\stable-diffusion\absolutereality_v181.safetensors to device: cuda:0
No module 'xformers'. Proceeding without it.
07:33:23.103 INFO cuda:0 loading on diffusers init.py:174
07:33:23.104 INFO cuda:0 using config: init.py:176
C:\EasyDiffusion\installer_files\env\lib\site-packages\sdkit\models\models_db\configs\v1-inference.yaml
07:33:23.120 INFO cuda:0 using attn_precision: fp16 init.py:192
07:33:23.235 INFO MainThread Opening browser.. app.py:318
╭────────────────────────────────────────────── Easy Diffusion is ready ───────────────────────────────────────────────╮
│ │
│ Easy Diffusion is ready to serve requests. │
│ │
│ A new browser tab should have been opened by now. │
│ If not, please open your web browser and navigate to http://localhost:9000/ │
│ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
07:33:24.548 INFO AnyIO worker thread Scanning all model folders for models... model_manager.py:414
07:33:25.146 INFO AnyIO worker thread Scanned 12 models. Nothing infected model_manager.py:425
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_img2img.StableDiffusionImg2ImgPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
C:\EasyDiffusion\installer_files\env\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion_inpaint_legacy.py:144: FutureWarning: The class <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_inpaint_legacy.StableDiffusionInpaintPipelineLegacy'> is deprecated and will be removed in v1.0.0. You can achieve exactly the same functionalityby loading your model into StableDiffusionInpaintPipeline instead. See https://github.com/huggingface/diffusers/pull/3533for more information.
deprecate("legacy is outdated", "1.0.0", deprecation_message, standard_warn=False)
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_inpaint_legacy.StableDiffusionInpaintPipelineLegacy'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
07:33:26.983 INFO cuda:0 Loaded on diffusers init.py:436
07:33:27.175 INFO cuda:0 loaded stable-diffusion model from init.py:56
C:\EasyDiffusion\models\stable-diffusion\absolutereality_v181.safetensors to device: cuda:0
07:33:27.181 INFO cuda:0 loading vae model from init.py:52
C:\EasyDiffusion\models\vae\vae-ft-mse-840000-ema-pruned.ckpt to device: cuda:0
C:\EasyDiffusion\installer_files\env\lib\site-packages\sdkit\models\model_loader\vae.py:38: FutureWarning: Accessing config attribute sample_size directly via 'AutoencoderKL' object attribute is deprecated. Please access 'sample_size' over 'AutoencoderKL's config object instead, e.g. 'unet.config.sample_size'.
image_size = m.vae.sample_size
07:33:27.265 INFO cuda:0 Loading diffusers vae vae.py:44
07:33:27.336 INFO cuda:0 loaded vae model from init.py:56
C:\EasyDiffusion\models\vae\vae-ft-mse-840000-ema-pruned.ckpt to device: cuda:0
07:33:38.251 INFO cuda:0 Session 1714282403941 starting task 2321109965072 on NVIDIA GeForce RTX task_manager.py:280
4060
07:33:38.313 INFO cuda:0 loading gfpgan model from C:\EasyDiffusion\models\gfpgan\GFPGANv1.4.pth to init.py:52
device: cuda:0
C:\EasyDiffusion\installer_files\env\lib\site-packages\torchvision\transforms\functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be removed in 0.17. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
warnings.warn(
C:\EasyDiffusion\installer_files\env\lib\site-packages\torchvision\models_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
C:\EasyDiffusion\installer_files\env\lib\site-packages\torchvision\models_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=None.
warnings.warn(msg)
07:33:40.178 INFO cuda:0 loaded gfpgan model from C:\EasyDiffusion\models\gfpgan\GFPGANv1.4.pth to init.py:56
device: cuda:0
07:33:40.180 INFO cuda:0 loading realesrgan model from init.py:52
C:\EasyDiffusion\models\realesrgan\RealESRGAN_x4plus.pth to device: cuda:0
07:33:41.118 INFO cuda:0 loaded realesrgan model from init.py:56
C:\EasyDiffusion\models\realesrgan\RealESRGAN_x4plus.pth to device: cuda:0
07:33:41.274 INFO cuda:0 unloaded stable-diffusion model from device: cuda:0 init.py:87
07:33:41.276 INFO cuda:0 loading stable-diffusion model from init.py:52
C:\EasyDiffusion\models\stable-diffusion\absolutereality_v181.safetensors to device: cuda:0
07:33:41.447 INFO cuda:0 loading on diffusers init.py:174
07:33:41.448 INFO cuda:0 using config: init.py:176
C:\EasyDiffusion\installer_files\env\lib\site-packages\sdkit\models\models_db\configs\v1-inference.yaml
07:33:41.464 INFO cuda:0 using attn_precision: fp16 init.py:192
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_img2img.StableDiffusionImg2ImgPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_inpaint_legacy.StableDiffusionInpaintPipelineLegacy'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
07:33:44.620 INFO cuda:0 Loaded on diffusers init.py:436
07:33:44.802 INFO cuda:0 loaded stable-diffusion model from init.py:56
C:\EasyDiffusion\models\stable-diffusion\absolutereality_v181.safetensors to device: cuda:0
07:33:45.372 INFO cuda:0 unloaded vae model from device: cuda:0 init.py:87
07:33:45.374 INFO cuda:0 loading vae model from init.py:52
C:\EasyDiffusion\models\vae\vae-ft-mse-840000-ema-pruned.ckpt to device: cuda:0
07:33:45.469 INFO cuda:0 Loading diffusers vae vae.py:44
07:33:45.562 INFO cuda:0 loaded vae model from init.py:56
C:\EasyDiffusion\models\vae\vae-ft-mse-840000-ema-pruned.ckpt to device: cuda:0
07:33:45.576 INFO cuda:0 request: {'clip_skip': True, render_images.py:179
'guidance_scale': 7.5,
'height': 768,
'negative_prompt': 'old, wrinkles, ugly, low quality, sad, crying',
'num_inference_steps': 60,
'prompt': 'rtx, best quality, photograph of a man',
'sampler_name': 'dpmpp_2m',
'seed': 1769469827,
'tiling': None,
'upscale_amount': 2,
'use_controlnet_model': None,
'use_embeddings_model': None,
'use_face_correction': 'GFPGANv1.4',
'use_lora_model': None,
'use_stable_diffusion_model': 'absolutereality_v181',
'use_upscale': 'RealESRGAN_x4plus',
'use_vae_model': 'vae-ft-mse-840000-ema-pruned',
'width': 768}
07:33:45.582 INFO cuda:0 task data: {'block_nsfw': False, render_images.py:180
'clip_skip': True,
'codeformer_fidelity': 0.5,
'codeformer_upscale_faces': False,
'control_filter_to_apply': None,
'enable_vae_tiling': True,
'filter_params': {'realesrgan': {'scale': 2}},
'filters': ['gfpgan', 'realesrgan'],
'latent_upscaler_steps': 10,
'request_id': 2321109965072,
'session_id': '1714282403941',
'show_only_filtered_image': True,
'stream_image_progress': False,
'stream_image_progress_interval': 5,
'upscale_amount': 2,
'use_controlnet_model': None,
'use_embeddings_model': None,
'use_face_correction': 'GFPGANv1.4',
'use_hypernetwork_model': None,
'use_lora_model': None,
'use_stable_diffusion_model': 'absolutereality_v181',
'use_upscale': 'RealESRGAN_x4plus',
'use_vae_model': 'vae-ft-mse-840000-ema-pruned',
'vram_usage_level': 'balanced'}
07:33:45.591 INFO cuda:0 output format: {'output_format': 'png', 'output_lossless': False, render_images.py:182
'output_quality': 75}
07:33:45.593 INFO cuda:0 save data: {'metadata_output_format': 'none', render_images.py:183
'save_to_disk_path': 'C:\Users\Username\Stable Diffusion UI'}
07:33:45.596 INFO cuda:0 Global seed set to 1769469827 seed.py:65
07:33:45.603 INFO cuda:0 Using sampler: DPMSolverMultistepScheduler { image_generator.py:331
"_class_name": "DPMSolverMultistepScheduler",
"_diffusers_version": "0.20.2",
"algorithm_type": "dpmsolver++",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"dynamic_thresholding_ratio": 0.995,
"lambda_min_clipped": -Infinity,
"lower_order_final": true,
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"sample_max_value": 1.0,
"set_alpha_to_one": false,
"solver_order": 2,
"solver_type": "midpoint",
"steps_offset": 1,
"thresholding": false,
"timestep_spacing": "linspace",
"trained_betas": null,
"use_karras_sigmas": true,
"variance_type": null
}
because of dpmpp_2m
07:33:45.610 INFO cuda:0 Applying tiling settings image_generator.py:358
07:33:45.615 INFO cuda:0 Parsing the prompt... image_generator.py:402
07:33:45.617 INFO cuda:0 compel is ready image_generator.py:406
07:33:46.137 INFO cuda:0 Made prompt embeds image_generator.py:428
07:33:46.237 INFO cuda:0 Made negative prompt embeds image_generator.py:431
07:33:46.287 INFO cuda:0 Done parsing the prompt image_generator.py:437
07:33:46.289 INFO cuda:0 applying: StableDiffusionPipeline { image_generator.py:448
"_class_name": "StableDiffusionPipeline",
"_diffusers_version": "0.20.2",
"feature_extractor": [
null,
null
],
"requires_safety_checker": true,
"safety_checker": [
null,
null
],
"scheduler": [
"diffusers",
"DPMSolverMultistepScheduler"
],
"text_encoder": [
"transformers",
"CLIPTextModel"
],
"tokenizer": [
"transformers",
"CLIPTokenizer"
],
"unet": [
"diffusers",
"UNet2DConditionModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}

07:33:46.306 INFO cuda:0 Running on diffusers: {'guidance_scale': 7.5, 'generator': image_generator.py:449
<torch._C.Generator object at 0x0000021C6CD50DD0>, 'width': 768, 'height': 768,
'num_inference_steps': 60, 'num_images_per_prompt': 1, 'callback': <function
make_with_diffusers.. at 0x0000021C6CCFAD30>, 'prompt_embeds': tensor([[[-0.3916,
0.0289, -0.0723, ..., -0.4939, -0.3120, 0.0659],
[-0.0391, -0.2725, -0.9487, ..., -1.2822, -0.3435, -0.3989],
[ 0.0518, 1.9111, -0.6919, ..., -1.1494, 1.6523, 0.0962],
...,
[-0.9590, -0.0153, -0.7661, ..., -0.6963, -0.2949, 0.1448],
[-0.9458, -0.0275, -0.7783, ..., -0.7021, -0.2871, 0.1411],
[-0.9131, 0.0562, -0.7178, ..., -0.7349, -0.2744, 0.0648]]],
device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds': tensor([[[-0.3916,
0.0289, -0.0723, ..., -0.4939, -0.3120, 0.0659],
[-0.9434, 0.8955, -0.6763, ..., 0.1123, -2.6289, -0.1826],
[-1.2021, 0.9248, 1.9102, ..., -0.1401, -1.3408, 0.5996],
...,
[-1.1152, 0.7305, -0.1289, ..., -0.9619, -0.3145, -0.4160],
[-1.0938, 0.7397, -0.1584, ..., -0.9404, -0.3086, -0.4272],
[-1.1006, 0.7480, -0.0366, ..., -0.9497, -0.3604, -0.5020]]],
device='cuda:0', dtype=torch.float16)}
2%|█▍ | 1/60 [00:01<01:55, 1.96s/it]
07:33:48.304 ERROR cuda:0 Traceback (most recent call last): task_manager.py:292
File "C:\EasyDiffusion\ui\easydiffusion\task_manager.py", line 284, in thread_render
task.run()
File "C:\EasyDiffusion\ui\easydiffusion\tasks\render_images.py", line 92, in run
self.response = make_images(
File "C:\EasyDiffusion\ui\easydiffusion\tasks\render_images.py", line 152, in make_images
images, seeds = make_images_internal(
File "C:\EasyDiffusion\ui\easydiffusion\tasks\render_images.py", line 197, in make_images_internal
images, user_stopped = generate_images_internal(
File "C:\EasyDiffusion\ui\easydiffusion\tasks\render_images.py", line 288, in generate_images_internal
images = generate_images(context, callback=callback, **req.dict())
File "C:\EasyDiffusion\installer_files\env\lib\site-packages\sdkit\generate\image_generator.py", line 71, in generate_images
return make_with_diffusers(
File "C:\EasyDiffusion\installer_files\env\lib\site-packages\sdkit\generate\image_generator.py", line 457, in make_with_diffusers
images = operation_to_apply(**cmd).images
File "C:\EasyDiffusion\installer_files\env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\EasyDiffusion\installer_files\env\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py", line 695, in __call__
latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]
File "C:\EasyDiffusion\installer_files\env\lib\site-packages\diffusers\schedulers\scheduling_dpmsolver_multistep.py", line 659, in step
step_index = (self.timesteps == timestep).nonzero()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Exception in thread cuda:0:
Traceback (most recent call last):
File "C:\EasyDiffusion\installer_files\env\lib\threading.py", line 932, in _bootstrap_inner
self.run()
File "C:\EasyDiffusion\installer_files\env\lib\threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "C:\EasyDiffusion\ui\easydiffusion\task_manager.py", line 294, in thread_render
gc(runtime.context)
File "C:\EasyDiffusion\installer_files\env\lib\site-packages\sdkit\utils\memory_utils.py", line 18, in gc
torch.cuda.empty_cache()
File "C:\EasyDiffusion\installer_files\env\lib\site-packages\torch\cuda\memory.py", line 133, in empty_cache
torch._C._cuda_emptyCache()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.`
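The error text itself suggests `CUDA_LAUNCH_BLOCKING=1`, and it notes that the reported stack frame may not be the real culprit because CUDA kernels are reported asynchronously. A minimal sketch of starting a process with that variable set is shown below. The helper names `blocking_env` and `run_with_blocking` are hypothetical, and the probe command is only a stand-in for the real launcher, which is not named here. The variable has to be present in the environment before the process initializes CUDA.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking(cmd):
    """Run cmd (a list of arguments) in a child process that sees CUDA_LAUNCH_BLOCKING=1."""
    return subprocess.run(cmd, env=blocking_env(), capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in command: the child only echoes the variable to show it was inherited.
    probe = [sys.executable, "-c", "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]
    print(run_with_blocking(probe).stdout.strip())
```

With synchronous launches, generation will likely run slower, but the traceback should point at the call that actually triggered the illegal memory access rather than at a later, unrelated one such as `nonzero()` or `empty_cache()`. On Windows, the same effect can be had by running `set CUDA_LAUNCH_BLOCKING=1` in the command prompt before starting the application from that prompt.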