Quantization of models?
MilanBojic1999 opened this issue · 1 comment
MilanBojic1999 commented
Hi,
I am wondering if there are any plans to release quantized versions of the Chameleon models.
I'm working with an RTX 3060 (12 GB), and when I try to load the 7B model I get a CUDA out-of-memory error.
Thank you!
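For context, the out-of-memory error is expected with full-precision weights. A rough back-of-envelope estimate (weights only, ignoring activations, the KV cache, and CUDA overhead) shows why fp16 7B weights alone exceed a 12 GB card, while 4-bit weights would fit:

```python
# Back-of-envelope weight-memory estimate for a 7B-parameter model.
# Illustrative only: real usage also includes activations, KV cache,
# and framework/CUDA overhead.
PARAMS = 7e9
GIB = 1024**3  # bytes per GiB

def weight_gib(bits_per_param: float) -> float:
    """Approximate weight memory in GiB at the given precision."""
    return PARAMS * bits_per_param / 8 / GIB

fp16 = weight_gib(16)  # ~13.0 GiB -> more than a 12 GB card can hold
int4 = weight_gib(4)   # ~3.3 GiB  -> would fit with room to spare

print(f"fp16 weights: {fp16:.1f} GiB")
print(f"4-bit weights: {int4:.1f} GiB")
```

So even before activations are counted, the fp16 weights by themselves do not fit in 12 GB, which is why a quantized (e.g. 4-bit or 8-bit) checkpoint is what would make this card viable.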
lshamis commented
Unfortunately, we have no plans to release a quantized model in the near future.