max_memory?
fabianocastello opened this issue · 2 comments
I'm quite new to the LLM world. I'm Brazilian and decided to start with Cabrita. I managed to work through several errors, but I am stuck with this one:
"Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit
the quantized model. If you have set a value for max_memory
you should increase that. To have
an idea of the modules that are set on the CPU or RAM you can print model.hf_device_map."
I tried several solutions, but none worked. Is it a memory problem?
I have a 2018 MacBook Pro (i7, 16 GB RAM).
Your issue probably really is a lack of memory, since the minimum required to run this model (7 billion parameters) is around 12 GB of VRAM (GPU memory). You can try the GPUs on Google Colab, or use a computer with at least 32 GB of RAM and a video card with at least 12 GB of VRAM.
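
Before paying for more hardware, it may also be worth checking exactly what the error is pointing at: max_memory is the per-device cap you can pass to from_pretrained, and model.hf_device_map shows where each module actually landed. Here's a minimal sketch of those knobs, assuming the standard Hugging Face transformers loading path (the model name and memory limits below are illustrative guesses, not taken from the Cabrita notebook):

```python
# Rough sketch, assuming the standard transformers + accelerate loading path.
# The model name and the memory caps are illustrative assumptions, not taken
# from the Cabrita notebook.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # assumed LLaMA-7B base used by Cabrita
    load_in_8bit=True,    # quantized load; this is the path the error comes from
    device_map="auto",    # let accelerate spread modules across GPU/CPU/disk
    # Per-device caps the error message refers to; raise these if you have headroom.
    max_memory={0: "12GiB", "cpu": "30GiB"},
)

# Shows which modules landed on the GPU, CPU, or disk, as the error suggests.
print(model.hf_device_map)
```

One caveat: load_in_8bit relies on bitsandbytes, which as far as I know needs an NVIDIA (CUDA) GPU, so this won't run locally on a MacBook in any case; that's one more reason Colab is the easier route.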
I was having the same problem trying to fine-tune alpaca-lora; to solve it, I had to subscribe to Colab Pro for more powerful GPUs and more RAM. I didn't try to run cabrita-lora.ipynb before that, but try running it on the free tier of Colab first; if you get an error, the solution may be the same one I used.