kabachuha/sd-webui-text2video

[Bug]: Not enough memory (Tried to allocate 52428800/4194304/19660800 bytes)

Huey-Lewy opened this issue · 3 comments

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

Are you using the latest version of the extension?

  • I have the modelscope text2video extension updated to the latest version and I still have the issue.

What happened?

I'm running into "not enough memory" errors and can't figure out why.
It was working before, but it no longer does.

Model Type: ModelScope
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 52428800 bytes.

Model Type: VideoCrafter (WIP)
Note: Refer to Settings section Image 1
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 4194304 bytes.

Model Type: VideoCrafter (WIP)
Note: Refer to Settings section Image 2
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 19660800 bytes.
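
For scale, the three requested sizes above are small. The sketch below converts them to MiB; the helper name `to_mib` is illustrative and not part of the extension. `DefaultCPUAllocator` is PyTorch's allocator for system RAM, so these failures suggest that host memory, rather than the GPU's VRAM, was exhausted at the time of the request.

```python
# Requested allocation sizes copied from the three errors above.
FAILED_ALLOCATIONS = [52428800, 4194304, 19660800]


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


for size in FAILED_ALLOCATIONS:
    print(f"{size} bytes = {to_mib(size):.2f} MiB")
```

All three requests are at most 50 MiB, which points at how little RAM was free rather than at the size of any single tensor.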

Steps to reproduce the problem

  1. Go to the "text2video" tab
  2. Select "ModelScope" or "VideoCrafter (WIP)" as the Model type
  3. Put "anime girl dancing" in the prompt as a test
  4. Generate
  5. Click "Update the video"
  6. Get an error video instead of the expected result.

What should have happened?

It should have generated a video, but instead it ran into a memory error.
For this bug report, I used "anime girl dancing" as the prompt. Instead of generating a video based on the prompt, it shows a video saying "If you see this video it means an error check your logs".

WebUI and Deforum extension Commit IDs

WebUI commit id - v1.3.0-72-gb957dcfe
txt2vid commit id - a8937bafb63867d15bd5e62afe31f431fcde1c49

Torch version

torch: 2.0.1+cu118

What GPU were you using for launching?

ASUS TUF Gaming GeForce RTX 3080 Ti with 12GB of VRAM

On which platform are you launching the webui backend with the extension?

Local PC setup (Windows)

Settings

(Here are the settings I've tried using for figuring this stuff out)
[Image 1]
[Image 2]

Console logs

venv "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Version: v1.3.0-72-gb957dcfe
Commit hash: b957dcfece29c84ac0cfcd5a69475ff8684c531f
Installing requirements

Launching Web UI with arguments:
No module 'xformers'. Proceeding without it.
Loading weights [89d59c3dde] from C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\models\Stable-diffusion\animefull-final-pruned\nai.ckpt
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 7.2s (import torch: 2.2s, import gradio: 1.3s, import ldm: 0.6s, other imports: 1.0s, load scripts: 1.4s, create ui: 0.4s, gradio launch: 0.1s).
Creating model from config: C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\models\Stable-diffusion\animefull-final-pruned\nai.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights found near the checkpoint: C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\models\Stable-diffusion\nai.vae.pt
Applying attention optimization: sdp-no-mem... done.
Textual inversion embeddings loaded(0):
Model loaded in 6.5s (load weights from disk: 2.6s, create model: 1.0s, apply weights to model: 1.0s, apply half(): 0.7s, load VAE: 0.6s, move model to device: 0.5s).
text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: a8937baf
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Error verifying pickled file from C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\models/ModelScope/t2v\text2video_pytorch_model.pth:
Traceback (most recent call last):
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\modules\safe.py", line 136, in load_with_extra
    check_pt(filename, extra_handler)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\modules\safe.py", line 94, in check_pt
    unpickler.load()
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_utils.py", line 169, in _rebuild_tensor_v2
    tensor = _rebuild_tensor(storage, storage_offset, size, stride)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_utils.py", line 148, in _rebuild_tensor
    return t.set_(storage._untyped_storage, storage_offset, size, stride)
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 52428800 bytes.


The file may be malicious, so the program is not going to read it.
You can skip this check with --disable-safe-unpickle commandline argument.


Traceback (most recent call last):
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 55, in process_modelscope
    pipe = setup_pipeline()
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 26, in setup_pipeline
    return TextToVideoSynthesis(ph.models_path + '/ModelScope/t2v')
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 85, in __init__
    self.sd_model.load_state_dict(
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1994, in load_state_dict
    raise TypeError("Expected state_dict to be dict-like, got {}.".format(type(state_dict)))
TypeError: Expected state_dict to be dict-like, got <class 'NoneType'>.
Exception occurred: Expected state_dict to be dict-like, got <class 'NoneType'>.
text2video — The model selected is:  VideoCrafter (WIP)
 text2video extension for auto1111 webui
Git commit: a8937baf
VideoCrafter config:
 {'model': {'target': 'lvdm.models.ddpm3d.LatentDiffusion', 'params': {'linear_start': 0.00085, 'linear_end': 0.012, 'num_timesteps_cond': 1, 'log_every_t': 200, 'timesteps': 1000, 'first_stage_key': 'video', 'cond_stage_key': 'caption', 'image_size': [32, 32], 'video_length': 16, 'channels': 4, 'cond_stage_trainable': False, 'conditioning_key': 'crossattn', 'scale_by_std': False, 'scale_factor': 0.18215, 'unet_config': {'target': 'lvdm.models.modules.openaimodel3d.UNetModel', 'params': {'image_size': 32, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2, 1], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4, 4], 'num_heads': 8, 'transformer_depth': 1, 'context_dim': 768, 'use_checkpoint': True, 'legacy': False, 'kernel_size_t': 1, 'padding_t': 0, 'temporal_length': 16, 'use_relative_position': True}}, 'first_stage_config': {'target': 'lvdm.models.autoencoder.AutoencoderKL', 'params': {'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}}, 'cond_stage_config': {'target': 'lvdm.models.modules.condition_modules.FrozenCLIPEmbedder'}}}}
Loading model from C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\models/VideoCrafter/model.ckpt
LatentDiffusion: Running in eps-prediction mode
Successfully initialize the diffusion model !
DiffusionWrapper has 958.92 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Error verifying pickled file from C:\Users\Herculean/.cache\huggingface\hub\models--openai--clip-vit-large-patch14\snapshots\8d052a0f05efbaefbc9e8786ba291cfdf93e5bff\pytorch_model.bin:
Traceback (most recent call last):
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\modules\safe.py", line 136, in load_with_extra
    check_pt(filename, extra_handler)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\modules\safe.py", line 94, in check_pt
    unpickler.load()
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_utils.py", line 169, in _rebuild_tensor_v2
    tensor = _rebuild_tensor(storage, storage_offset, size, stride)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_utils.py", line 148, in _rebuild_tensor
    return t.set_(storage._untyped_storage, storage_offset, size, stride)
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 4194304 bytes.


The file may be malicious, so the program is not going to read it.
You can skip this check with --disable-safe-unpickle commandline argument.


Traceback (most recent call last):
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 26, in run
    vids_pack = process_videocrafter(args_dict)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\process_videocrafter.py", line 41, in process_videocrafter
    model, _, _ = load_model(config, ph.models_path+'/VideoCrafter/model.ckpt', #TODO: support safetensors and stuff
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\sample_utils.py", line 27, in load_model
    model = instantiate_from_config(config.model)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\utils\common_utils.py", line 112, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\ddpm3d.py", line 538, in __init__
    self.instantiate_cond_stage(cond_stage_config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\ddpm3d.py", line 624, in instantiate_cond_stage
    model = instantiate_from_config(config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\utils\common_utils.py", line 112, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\modules\condition_modules.py", line 20, in __init__
    self.transformer = CLIPTextModel.from_pretrained(version)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\modeling_utils.py", line 2258, in from_pretrained
    loaded_state_dict_keys = [k for k in state_dict.keys()]
AttributeError: 'NoneType' object has no attribute 'keys'
Exception occurred: 'NoneType' object has no attribute 'keys'
text2video — The model selected is:  VideoCrafter (WIP)
 text2video extension for auto1111 webui
Git commit: a8937baf
VideoCrafter config:
 {'model': {'target': 'lvdm.models.ddpm3d.LatentDiffusion', 'params': {'linear_start': 0.00085, 'linear_end': 0.012, 'num_timesteps_cond': 1, 'log_every_t': 200, 'timesteps': 1000, 'first_stage_key': 'video', 'cond_stage_key': 'caption', 'image_size': [32, 32], 'video_length': 16, 'channels': 4, 'cond_stage_trainable': False, 'conditioning_key': 'crossattn', 'scale_by_std': False, 'scale_factor': 0.18215, 'unet_config': {'target': 'lvdm.models.modules.openaimodel3d.UNetModel', 'params': {'image_size': 32, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2, 1], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4, 4], 'num_heads': 8, 'transformer_depth': 1, 'context_dim': 768, 'use_checkpoint': True, 'legacy': False, 'kernel_size_t': 1, 'padding_t': 0, 'temporal_length': 16, 'use_relative_position': True}}, 'first_stage_config': {'target': 'lvdm.models.autoencoder.AutoencoderKL', 'params': {'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}}, 'cond_stage_config': {'target': 'lvdm.models.modules.condition_modules.FrozenCLIPEmbedder'}}}}
Loading model from C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\models/VideoCrafter/model.ckpt
LatentDiffusion: Running in eps-prediction mode
Successfully initialize the diffusion model !
DiffusionWrapper has 958.92 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Traceback (most recent call last):
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 26, in run
    vids_pack = process_videocrafter(args_dict)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\process_videocrafter.py", line 41, in process_videocrafter
    model, _, _ = load_model(config, ph.models_path+'/VideoCrafter/model.ckpt', #TODO: support safetensors and stuff
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\sample_utils.py", line 27, in load_model
    model = instantiate_from_config(config.model)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\utils\common_utils.py", line 112, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\ddpm3d.py", line 538, in __init__
    self.instantiate_cond_stage(cond_stage_config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\ddpm3d.py", line 624, in instantiate_cond_stage
    model = instantiate_from_config(config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\utils\common_utils.py", line 112, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\modules\condition_modules.py", line 20, in __init__
    self.transformer = CLIPTextModel.from_pretrained(version)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\modeling_utils.py", line 2276, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 771, in __init__
    self.text_model = CLIPTextTransformer(config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 678, in __init__
    self.encoder = CLIPEncoder(config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 581, in __init__
    self.layers = nn.ModuleList([CLIPEncoderLayer(config) for _ in range(config.num_hidden_layers)])
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 581, in <listcomp>
    self.layers = nn.ModuleList([CLIPEncoderLayer(config) for _ in range(config.num_hidden_layers)])
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 356, in __init__
    self.mlp = CLIPMLP(config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 340, in __init__
    self.fc1 = nn.Linear(config.hidden_size, config.intermediate_size)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\linear.py", line 96, in __init__
    self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 9437184 bytes.
Exception occurred: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 9437184 bytes.
text2video — The model selected is:  VideoCrafter (WIP)
 text2video extension for auto1111 webui
Git commit: a8937baf
VideoCrafter config:
 {'model': {'target': 'lvdm.models.ddpm3d.LatentDiffusion', 'params': {'linear_start': 0.00085, 'linear_end': 0.012, 'num_timesteps_cond': 1, 'log_every_t': 200, 'timesteps': 1000, 'first_stage_key': 'video', 'cond_stage_key': 'caption', 'image_size': [32, 32], 'video_length': 16, 'channels': 4, 'cond_stage_trainable': False, 'conditioning_key': 'crossattn', 'scale_by_std': False, 'scale_factor': 0.18215, 'unet_config': {'target': 'lvdm.models.modules.openaimodel3d.UNetModel', 'params': {'image_size': 32, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2, 1], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4, 4], 'num_heads': 8, 'transformer_depth': 1, 'context_dim': 768, 'use_checkpoint': True, 'legacy': False, 'kernel_size_t': 1, 'padding_t': 0, 'temporal_length': 16, 'use_relative_position': True}}, 'first_stage_config': {'target': 'lvdm.models.autoencoder.AutoencoderKL', 'params': {'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}}, 'cond_stage_config': {'target': 'lvdm.models.modules.condition_modules.FrozenCLIPEmbedder'}}}}
Loading model from C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\models/VideoCrafter/model.ckpt
LatentDiffusion: Running in eps-prediction mode
Successfully initialize the diffusion model !
DiffusionWrapper has 958.92 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Error verifying pickled file from C:\Users\Herculean/.cache\huggingface\hub\models--openai--clip-vit-large-patch14\snapshots\8d052a0f05efbaefbc9e8786ba291cfdf93e5bff\pytorch_model.bin:
Traceback (most recent call last):
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\modules\safe.py", line 136, in load_with_extra
    check_pt(filename, extra_handler)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\modules\safe.py", line 94, in check_pt
    unpickler.load()
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_utils.py", line 169, in _rebuild_tensor_v2
    tensor = _rebuild_tensor(storage, storage_offset, size, stride)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\_utils.py", line 148, in _rebuild_tensor
    return t.set_(storage._untyped_storage, storage_offset, size, stride)
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 16777216 bytes.


The file may be malicious, so the program is not going to read it.
You can skip this check with --disable-safe-unpickle commandline argument.


Traceback (most recent call last):
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 26, in run
    vids_pack = process_videocrafter(args_dict)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\process_videocrafter.py", line 41, in process_videocrafter
    model, _, _ = load_model(config, ph.models_path+'/VideoCrafter/model.ckpt', #TODO: support safetensors and stuff
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\sample_utils.py", line 27, in load_model
    model = instantiate_from_config(config.model)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\utils\common_utils.py", line 112, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\ddpm3d.py", line 538, in __init__
    self.instantiate_cond_stage(cond_stage_config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\ddpm3d.py", line 624, in instantiate_cond_stage
    model = instantiate_from_config(config)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\utils\common_utils.py", line 112, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\videocrafter\lvdm\models\modules\condition_modules.py", line 20, in __init__
    self.transformer = CLIPTextModel.from_pretrained(version)
  File "C:\Users\Herculean\Desktop\Folders\Programs & Scripts\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\transformers\modeling_utils.py", line 2258, in from_pretrained
    loaded_state_dict_keys = [k for k in state_dict.keys()]
AttributeError: 'NoneType' object has no attribute 'keys'
Exception occurred: 'NoneType' object has no attribute 'keys'
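
The allocation failures scattered through the log above can be collected with a short script. This is a minimal sketch: the function name `failed_allocations` and the `sample` string are illustrative, and the pattern assumes the allocator message keeps the exact wording seen in this log.

```python
import re

# Matches the allocator message that appears throughout the log above.
ALLOC_ERROR = re.compile(
    r"DefaultCPUAllocator: not enough memory: you tried to allocate (\d+) bytes"
)


def failed_allocations(log_text: str) -> list[int]:
    """Return every failed CPU allocation size (in bytes) found in the log, in order."""
    return [int(match.group(1)) for match in ALLOC_ERROR.finditer(log_text)]


# Two lines copied from the log, used here as sample input.
sample = (
    "RuntimeError: [enforce fail at ..\\c10\\core\\impl\\alloc_cpu.cpp:72] data. "
    "DefaultCPUAllocator: not enough memory: you tried to allocate 52428800 bytes.\n"
    "RuntimeError: [enforce fail at ..\\c10\\core\\impl\\alloc_cpu.cpp:72] data. "
    "DefaultCPUAllocator: not enough memory: you tried to allocate 4194304 bytes.\n"
)
print(failed_allocations(sample))
```

Running it over a saved copy of the full console output lists each failed request, which makes it easier to see that the failures occur at different points during model loading.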

Additional information

I'm very new to trying things out on my own with Stable Diffusion. I followed the exact instructions in TroubleChute's video tutorial, and VideoCrafter worked the first couple of times. After a restart, it suddenly refused to do anything because of memory issues.

wzgrx commented

Error loading script: api_t2v.py
Traceback (most recent call last):
  File "G:\stable-diffusion-webui\modules\scripts.py", line 263, in load_scripts
    script_module = script_loading.load_module(scriptfile.path)
  File "G:\stable-diffusion-webui\modules\script_loading.py", line 10, in load_module
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "G:\stable-diffusion-webui\extensions\sd-webui-text2video\scripts\api_t2v.py", line 35, in <module>
    from t2v_helpers.args import T2VArgs_sanity_check, T2VArgs, T2VOutputArgs
  File "G:\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\t2v_helpers\args.py", line 7, in <module>
    from modelscope.t2v_model import has_torch2
ImportError: cannot import name 'has_torch2' from 'modelscope.t2v_model' (G:\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\modelscope\t2v_model.py)

Error loading script: text2vid.py
Traceback (most recent call last):
  File "G:\stable-diffusion-webui\modules\scripts.py", line 263, in load_scripts
    script_module = script_loading.load_module(scriptfile.path)
  File "G:\stable-diffusion-webui\modules\script_loading.py", line 10, in load_module
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "G:\stable-diffusion-webui\extensions\sd-webui-text2video\scripts\text2vid.py", line 20, in <module>
    from t2v_helpers.render import run
  File "G:\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 5, in <module>
    from modelscope.process_modelscope import process_modelscope
  File "G:\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 21, in <module>
    from t2v_helpers.args import get_outdir, process_args
  File "G:\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\t2v_helpers\args.py", line 7, in <module>
    from modelscope.t2v_model import has_torch2
ImportError: cannot import name 'has_torch2' from 'modelscope.t2v_model' (G:\stable-diffusion-webui/extensions/sd-webui-text2video/scripts\modelscope\t2v_model.py)

Why are you posting an error from your own console in my bug report?

It seems the issue is that the unpickler is flagging this file as potentially malicious:

Error verifying pickled file from C:\Users\Herculean/.cache\huggingface\hub\models--openai--clip-vit-large-patch14\snapshots\8d052a0f05efbaefbc9e8786ba291cfdf93e5bff\pytorch_model.bin:

I've seen other people have similar problems with the safety check. The first thing I would try is finding that file and deleting it. It may have downloaded only partially or been corrupted somehow. The file should download again automatically. See if that fixes your issue.

That being said, I'm not certain both tracebacks are related, but it's a good place to start.
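
Before deleting anything, it may help to check whether the cached file is actually there and how large it is. The sketch below lists every `pytorch_model.bin` under a cached repository folder together with its size. The function name `list_cached_weights` is illustrative, and the cache location is an assumption taken from the path in the log (`.cache\huggingface\hub` under the user's home directory).

```python
from pathlib import Path


def list_cached_weights(hub_dir: Path, repo_folder: str) -> list[tuple[Path, int]]:
    """Return (path, size in bytes) for each pytorch_model.bin under a cached repo."""
    snapshots = hub_dir / repo_folder / "snapshots"
    if not snapshots.is_dir():
        return []
    return sorted(
        (path, path.stat().st_size)
        for path in snapshots.rglob("pytorch_model.bin")
        if path.is_file()
    )


# Cache location assumed from the path in the log; adjust it if the cache was moved.
hub = Path.home() / ".cache" / "huggingface" / "hub"
for path, size in list_cached_weights(hub, "models--openai--clip-vit-large-patch14"):
    print(f"{path}: {size / (1024 * 1024):.1f} MiB")
```

If the reported size is much smaller than the size shown on the model's Hugging Face page, a partial download is plausible and deleting the file so it downloads again is a reasonable next step. The script only reads file sizes and does not delete anything.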