qwopqwop200/GPTQ-for-LLaMa

6-bit quantization

philipturner opened this issue · 1 comment

For smaller models, quantization causes more quality loss than it does for larger models. Could the repository try 6-bit quantization with group size 128 for models like LLaMa-7B? This could be most useful for some of the smaller language networks in Stable Diffusion.
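For illustration, here is a minimal sketch of what a 6-bit, group-size-128 format looks like, using plain round-to-nearest quantization in PyTorch. This is an assumption-laden toy, not GPTQ-for-LLaMa's actual code path: GPTQ adds Hessian-based error compensation on top of the per-group scale/zero-point layout shown here, and a real kernel would bit-pack the 6-bit values rather than store them in uint8.

```python
# Minimal sketch: round-to-nearest 6-bit quantization with group size 128.
# NOT GPTQ itself (no Hessian-based error correction); it only illustrates
# the storage format being requested: 6-bit integers with per-group scales
# and zero points along the input dimension.
import torch

def quantize_rtn(weight: torch.Tensor, bits: int = 6, group_size: int = 128):
    """Quantize a 2-D weight matrix [out_features, in_features] group-wise."""
    qmax = 2 ** bits - 1                      # 63 for 6-bit, range [0, 63]
    out_features, in_features = weight.shape
    assert in_features % group_size == 0
    w = weight.reshape(out_features, in_features // group_size, group_size)

    wmin = w.amin(dim=-1, keepdim=True)       # per-group minimum
    wmax = w.amax(dim=-1, keepdim=True)       # per-group maximum
    scale = (wmax - wmin).clamp(min=1e-8) / qmax
    zero = torch.round(-wmin / scale)         # per-group zero point

    q = torch.clamp(torch.round(w / scale) + zero, 0, qmax)
    dequant = (q - zero) * scale              # what a kernel reconstructs at runtime
    return (q.reshape(out_features, in_features).to(torch.uint8),
            scale, zero,
            dequant.reshape(out_features, in_features))

# Example: quantize one linear layer's weight and check reconstruction error.
w = torch.randn(4096, 4096)
q, scale, zero, w_hat = quantize_rtn(w, bits=6, group_size=128)
print("mean abs error:", (w - w_hat).abs().mean().item())
```

The extra 2 bits per weight roughly halve the per-group quantization step compared to 4-bit at the same group size, which is where the quality gain for small models would come from.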

Yes. 6-bit would work great for 13B and below to keep the model smarter.