rotemtzaban/STIT

CUDA out of memory

puoti opened this issue · 6 comments

puoti commented

Got this error:

RuntimeError: CUDA out of memory. Tried to allocate 2.54 GiB (GPU 0; 6.00 GiB total capacity; 425.36 MiB already allocated; 1.55 GiB free; 2.99 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

How can I fix this?
Is there a variable somewhere to reduce the batch size, or something similar?
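As a side note, the allocator hint mentioned in the error message itself can be tried before any code changes. Whether it helps depends on the workload, and the 128 MiB value below is only a starting guess to tune:

```shell
# Cap the size of split blocks in PyTorch's caching allocator, which can
# reduce fragmentation-related OOMs (128 MiB is an arbitrary starting value).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Rerun the failing script in this same shell. Verify the variable is set:
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

This only mitigates fragmentation (reserved >> allocated); it cannot help if the model genuinely needs more memory than the GPU has.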

Couldn't find the batch variable; try reducing the number of images in the dataset.

@puoti
Hi,
During which script does this error occur?
The current batch size is 1, so it can't be reduced. The models used are just rather large, but maybe there are options to lower the memory consumption (depending on the script).
I'll try to see if there are any simple memory optimizations I can make.

puoti commented

@rotemtzaban
Hi,
The error occurred during the "Editing + Stitching Tuning" process.
I attached a picture where you can see it better.
During the Inversion process, everything went well and as output I got the "model_patrick.pt" file.
The error occurred while running this command:
python edit_video_stitching_tuning.py --input_folder input --output_folder output --run_name Patrick --edit_name smile --edit_range 2 2 1 --outer_mask_dilation 50 --border_loss_threshold 0.005

Here is the picture: https://ibb.co/dr8BGKT

@puoti
Probably the simplest way to reduce the memory consumption here is to decrease outer_mask_dilation.
Changing it to around 30, say, might be enough to solve your problem.
I have only done a few tests with different values of this parameter, but I expect results to be similar/equivalent.
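Concretely, that would be the same command as above with the dilation lowered (30 is a suggested starting point from the discussion, not a verified value; a smaller dilation means smaller stitched regions and less memory):

```shell
# Identical to the original invocation except --outer_mask_dilation 50 -> 30
python edit_video_stitching_tuning.py --input_folder input --output_folder output \
  --run_name Patrick --edit_name smile --edit_range 2 2 1 \
  --outer_mask_dilation 30 --border_loss_threshold 0.005
```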

puoti commented

@rotemtzaban
It worked. Thank you very much.

@rotemtzaban Running into the same error during the inversion process.
RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 3.82 GiB total capacity; 2.57 GiB already allocated; 122.00 MiB free; 2.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I tried reducing the number of images in the dataset, but had no luck.