deep-diver/LLM-As-Chatbot

Getting torch.cuda.OutOfMemoryError: CUDA out of memory

imranraad07 opened this issue · 1 comment

When running the 30B version, I get this error when executing line 18 of model.py:

model = PeftModel.from_pretrained(model, finetuned, device_map={'': 0})

If device_map={'': 0} is removed, the model loads without error.

I am using a server with 7 GPUs, each with 32GB of memory.

Does the code support multiple GPUs? If not, would it be possible to add multi-GPU support?
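
For example, would letting accelerate shard the weights across the visible GPUs work? A minimal sketch of what I have in mind, assuming accelerate is installed; the checkpoint names below are placeholders and the actual loading calls in model.py may differ:

import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

# Placeholder checkpoint names, for illustration only.
base_ckpt = "decapoda-research/llama-30b-hf"
finetuned = "some-user/llama-30b-lora-adapter"

# device_map="auto" asks accelerate to spread the layers across all
# visible GPUs, instead of pinning the whole 30B model to GPU 0 the way
# device_map={'': 0} does, which is what overflows a single 32GB card.
model = LlamaForCausalLM.from_pretrained(
    base_ckpt,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(
    model,
    finetuned,
    device_map="auto",
)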

Sorry, I'm not targeting multi-GPU environments at the moment.

If you think you can, please propose a PR :)