huggingface/local-gemma

8-bit quantization

paolo-losi opened this issue · 1 comments

Would it be possible to support 8-bit quantization?

Hi @paolo-losi, we try to keep the number of args low, so we decided to go with 4-bit quantization for the memory preset. Is there an issue with the quality of the 4-bit model?
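For what it's worth, 8-bit loading is possible today by using 🤗 Transformers directly rather than local-gemma's presets. A minimal sketch with `BitsAndBytesConfig` (the checkpoint name here is just an example; requires `bitsandbytes` and a CUDA GPU):

```python
# Sketch: load a Gemma checkpoint with 8-bit weights via Transformers,
# bypassing local-gemma's preset system. Needs `bitsandbytes` installed
# and a CUDA-capable GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-2-9b-it"  # example checkpoint; adjust as needed

# load_in_8bit swaps linear layers for 8-bit (LLM.int8) equivalents
quantization_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)
```

This trades some of the memory savings of 4-bit for the (usually smaller) quality loss of 8-bit, so it can serve as a workaround if the 4-bit model underperforms.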