FMInference/FlexLLMGen

Issue with FlexGen when running a Python script


Description:

I encountered an issue when running the following command on a single RTX 3090:

bash:
python3 -m flexgen.flex_opt --model facebook/opt-30b --percent 0 100 100 0 100 0 --num-gpu-batches 2
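For reference, my understanding of the six --percent values (based on the FlexGen README; treat the exact per-field mapping as my assumption) is annotated below:

bash:
# Assumed --percent order:
#   weight-GPU% weight-CPU% cache-GPU% cache-CPU% activation-GPU% activation-CPU%
# So "0 100 100 0 100 0" keeps all weights on CPU, and the attention
# cache and activations entirely on the GPU.
python3 -m flexgen.flex_opt --model facebook/opt-30b \
    --percent 0 100 100 0 100 0 --num-gpu-batches 2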
The error message I received is:

error:
model size: 55.803 GB, cache size: 5.578 GB, hidden size (prefill): 0.058 GB
warmup - init weights
Load the pre-trained pytorch weights of opt-30b from huggingface. The downloading and cpu loading can take dozens of minutes. If it seems to get stuck, you can monitor the progress by checking the memory usage of this process.
Loading checkpoint shards: 43%|█████████████████████████████████████████▏ | 3/7 [03:34<04:48, 72.03s/it]Killed
I am trying to use FlexGen to run the model with offloading, but the process gets killed midway through loading the checkpoint shards. I am not sure why this is happening, and I would appreciate any help in resolving this issue.
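In case it is useful, this is roughly how I watch the memory usage of the loading process, as the log message suggests (a minimal sketch, assuming psutil is installed and the flexgen PID is passed as a command-line argument):

python:
import sys
import time
import psutil  # assumed available: pip install psutil

# Poll the resident memory of the flexgen process once per second
# until it exits, alongside overall system memory pressure.
pid = int(sys.argv[1])
proc = psutil.Process(pid)
while proc.is_running():
    rss_gb = proc.memory_info().rss / 1e9
    mem = psutil.virtual_memory()
    print(f"process RSS: {rss_gb:.1f} GB | system memory used: {mem.percent}%")
    time.sleep(1)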

Thank you!

The process was likely killed by the Linux out-of-memory (OOM) killer while the opt-30b weights were being loaded into CPU RAM. This should be fixed by #69, which has been merged into the main branch. Could you try again now?

Duplicate of #11.