CUDA out of memory
Aspector1 opened this issue · 6 comments
Hello, I'm trying to use YaLM to generate text with the pretrained models, but when I run generation I get an error:
```
RuntimeError: CUDA out of memory. Tried to allocate 76.00 MiB (GPU 0; 5.80 GiB total capacity; 62.50 MiB already allocated; 20.81 MiB free; 64.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
GPU is 1660, 6gb vram. Is there anything I can do about it or have I wasted a few weeks?
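For reference, the allocator hint mentioned in the error message is set through an environment variable. Note this only mitigates fragmentation of already-allocated VRAM; it cannot make a model of this size fit in 6 GB. The script name below is a placeholder, not an actual YaLM entry point.

```shell
# Allocator hint from the error message above. This reduces fragmentation
# by capping the size of cached blocks the allocator will split; it does
# not increase total available VRAM.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python generate.py  # placeholder for whatever script runs generation
```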
The neural network requires 200 GB of video memory to run. Have you even looked into the details?
> The neural network requires 200 GB of video memory to run. Have you even looked into the details?
I'm not trying to retrain the model, I'm trying to use it.
There is no difference.
> GPU is 1660, 6gb vram. Is there anything I can do about it or have I wasted a few weeks?
You may try huggingface-accelerate: https://github.com/huggingface/accelerate (see in particular https://github.com/huggingface/accelerate/blob/main/src/accelerate/big_modeling.py)
> > GPU is 1660, 6gb vram. Is there anything I can do about it or have I wasted a few weeks?
>
> You may try to use huggingface-accelerate https://github.com/huggingface/accelerate https://github.com/huggingface/accelerate/blob/main/src/accelerate/big_modeling.py
Can you tell me more about how to load such a large model on the 1660?
@Aspector1, by the way, did you use Docker to run it?