I am trying to run this model on the VideoLQ dataset, but StableVSR runs out of memory (OOM) in the raft_large model.
DaramGC opened this issue · 2 comments
DaramGC commented
I am trying to run this model on the VideoLQ dataset, but StableVSR runs out of memory (OOM) in the raft_large model.
How can I reduce memory usage? Thank you.
Error Message
/home/hjh9902/.conda/envs/stablevsr/lib/python3.8/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Loading pipeline components...: 100%|██████████| 6/6 [00:26<00:00, 4.34s/it]
You have disabled the safety checker for <class 'pipeline.stablevsr_pipeline.StableVSRPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
/home/hjh9902/.conda/envs/stablevsr/lib/python3.8/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Traceback (most recent call last):
File "test.py", line 64, in <module>
frames = pipeline('', frames, num_inference_steps=args.num_inference_steps, guidance_scale=0, of_model=of_model).images
File "/home/hjh9902/.conda/envs/stablevsr/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/hjh9902/evaluation/models/StableVSR/pipeline/stablevsr_pipeline.py", line 962, in __call__
forward_flows, backward_flows = self.compute_flows(of_model, upscaled_images, rescale_factor=of_rescale_factor)
File "/home/hjh9902/evaluation/models/StableVSR/pipeline/stablevsr_pipeline.py", line 712, in compute_flows
bflow = of.get_flow(of_model, prev_image, cur_image, rescale_factor=rescale_factor)
File "/home/hjh9902/evaluation/models/StableVSR/util/flow_utils.py", line 43, in get_flow
flows = of_model(target, source)
File "/home/hjh9902/.conda/envs/stablevsr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hjh9902/.conda/envs/stablevsr/lib/python3.8/site-packages/torchvision/models/optical_flow/raft.py", line 484, in forward
self.corr_block.build_pyramid(fmap1, fmap2)
File "/home/hjh9902/.conda/envs/stablevsr/lib/python3.8/site-packages/torchvision/models/optical_flow/raft.py", line 372, in build_pyramid
corr_volume = self._compute_corr_volume(fmap1, fmap2)
File "/home/hjh9902/.conda/envs/stablevsr/lib/python3.8/site-packages/torchvision/models/optical_flow/raft.py", line 416, in _compute_corr_volume
corr = torch.matmul(fmap1.transpose(1, 2), fmap2)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 21.97 GiB (GPU 0; 79.19 GiB total capacity; 39.70 GiB already allocated; 21.95 GiB free; 55.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
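For context on the size of that allocation: the failing line, `torch.matmul(fmap1.transpose(1, 2), fmap2)`, builds RAFT's all-pairs correlation volume, which has one entry per pair of feature-map positions. The sketch below is a rough estimate only. It assumes float32 features at 1/8 of the input resolution (as in torchvision's RAFT) and a hypothetical 2560x1920 input frame, a size picked because it reproduces the reported number; the actual frame size is not stated in this thread. The helper name `raft_corr_volume_gib` is made up for illustration.

```python
def raft_corr_volume_gib(height, width, batch=1, bytes_per_elem=4, downsample=8):
    """Estimate the size in GiB of RAFT's all-pairs correlation volume.

    Assumptions: features live at 1/downsample of the input resolution and
    the volume has shape (batch, h*w, h*w) with bytes_per_elem bytes per entry.
    """
    h, w = height // downsample, width // downsample
    n = h * w  # number of feature-map positions
    return batch * n * n * bytes_per_elem / 2**30


# Hypothetical frame size (not taken from the thread).
full = raft_corr_volume_gib(1920, 2560)
half = raft_corr_volume_gib(960, 1280)
print(f"{full:.2f} GiB at 2560x1920, {half:.2f} GiB at 1280x960")
```

Under these assumptions the estimate for 2560x1920 comes out at about 21.97 GiB, and halving both dimensions shrinks the volume by a factor of 16, since it grows with the square of the pixel count.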
claudiom4sir commented
Hi, I also ran into this issue with raft_large when upscaling larger frames. You can try raft_small instead of raft_large.
See the documentation here. Even though the large version was used during training, switching shouldn't change the results much.
DaramGC commented
Thx!