salesforce/progen

Weird behavior with memory usage within sample.py

GrayWasTaken opened this issue · 1 comment

So for some reason progen2-base uses an ungodly amount of VRAM the more I increase --num-samples. If I set --num-samples to 50, I get the following error, yet with 30, 40, or even 45, no issue occurs. I assume this is unintentional.

sampling
sampling took 36.29s
Traceback (most recent call last):
  File "sample.py", line 207, in <module>
    main()
  File "sample.py", line 193, in main
    completions = sample(device=device, model=model, tokenizer=tokenizer, context=args.context, pad_token_id=tokenizer.encode('<|pad|>').ids[0], num_return_sequences=args.num_samples, temp=args.t, top_p=args.p, max_length=args.max_length)
  File "sample.py", line 73, in sample
    tokens_batch = model.generate(input_ids, do_sample=True, temperature=temp, max_length=max_length, top_p=top_p, num_return_sequences=num_return_sequences, pad_token_id=pad_token_id)
  File "/home/.../.../progen/progen2/.venv/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/.../.../progen/progen2/.venv/lib/python3.7/site-packages/transformers/generation_utils.py", line 1210, in generate
    **model_kwargs,
  File "/home/.../.../progen/progen2/.venv/lib/python3.7/site-packages/transformers/generation_utils.py", line 1714, in sample
    output_hidden_states=output_hidden_states,
  File "/home/.../.../progen/progen2/.venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/.../.../progen/progen2/models/progen/modeling_progen.py", line 640, in forward
    return_dict=return_dict,
  File "/home/.../.../progen/progen2/.venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/.../.../progen/progen2/models/progen/modeling_progen.py", line 507, in forward
    output_attentions=output_attentions,
  File "/home/.../.../progen/progen2/.venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/.../.../progen/progen2/models/progen/modeling_progen.py", line 269, in forward
    output_attentions=output_attentions,
  File "/home/.../.../progen/progen2/.venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/.../.../progen/progen2/models/progen/modeling_progen.py", line 203, in forward
    value = torch.cat((past_value, value), dim=-2)
RuntimeError: CUDA out of memory. Tried to allocate 76.00 MiB (GPU 0; 14.76 GiB total capacity; 13.21 GiB already allocated; 37.75 MiB free; 13.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

For reference, this is the command I'm running:

python sample.py --model progen2-base --t 0.8 --p 90 --max-length 512 --num-samples 40 --context <232 AA sequence>
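
The growth itself is expected: generate() with do_sample=True expands the prompt into a batch of num_return_sequences copies, and every decoding step concatenates new keys and values onto the cached ones (the torch.cat at modeling_progen.py line 203 in the traceback), so peak VRAM scales roughly with num_samples × max_length. A workaround that keeps the total sample count is to draw the samples in smaller chunks. A minimal sketch against main() above, assuming sample() returns a list, with chunk_size as a guess to tune for your GPU:

# Hypothetical patch to main() in sample.py: draw the same total number
# of samples in chunks so the KV cache only ever holds chunk_size
# sequences at once.
chunk_size = 16
completions = []
for start in range(0, args.num_samples, chunk_size):
    n = min(chunk_size, args.num_samples - start)
    completions += sample(device=device, model=model, tokenizer=tokenizer,
                          context=args.context,
                          pad_token_id=tokenizer.encode('<|pad|>').ids[0],
                          num_return_sequences=n,
                          temp=args.t, top_p=args.p,
                          max_length=args.max_length)

With --num-samples 50 this peaks at the memory cost of 16 concurrent sequences instead of 50, at the price of a few extra forward passes over the context.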

Hi!

I am getting the exact same "RuntimeError: CUDA out of memory" error, and I am using the large model.
I have tried everything from https://discuss.pytorch.org/search?q=cuda%20out%20of%20memory but nothing has solved this annoying issue.

I also cleared the cache and tried the other steps from https://medium.com/@snk.nitin/how-to-solve-cuda-out-of-memory-error-850bb247cfb2, but it's not working.
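
For reference, the cache-clearing sequence from that article boils down to something like the following. It is not expected to help here, since empty_cache() only returns blocks PyTorch has cached but is no longer using, while the OOM above happens while generate() is still holding the growing KV cache:

import gc
import torch

del completions            # drop references to old output tensors first (variable name is illustrative)
gc.collect()               # let Python reclaim them
torch.cuda.empty_cache()   # hand cached, unused blocks back to the driver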

The with torch.no_grad() fix from pytorch/pytorch#16417 is already present (at line 1339 in /modules), but it still doesn't help.

I think if Dr. Madani could reduce the batch size, or make a similar change, it might solve this "RuntimeError: CUDA out of memory" error.

I assume an upgrade to PyTorch 2.0 in requirements.txt could solve this issue, although I have not tried it yet.
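
In the meantime, a stopgap that needs no dependency changes is to cast the model to half precision before sampling, which roughly halves weight, activation, and KV-cache memory. A minimal sketch, assuming the model object from sample.py (fp16 sample quality is not guaranteed):

# Hypothetical stopgap: fp16 weights, activations, and KV cache.
model = model.half()

# The error text itself also suggests tuning the allocator when
# reserved memory >> allocated memory, e.g. launching with:
#   PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128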

Thank you