6-bit quantization
philipturner opened this issue · 1 comment
philipturner commented
For smaller models, quantization causes more quality loss than it does for larger models. Could the repository try 6-bit quantization with a group size of 128 for models like LLaMa-7B? This could be most useful for some of the smaller language networks in Stable Diffusion.
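As a rough illustration of what is being requested, here is a minimal round-trip sketch of 6-bit quantization with a group size of 128, assuming a simple asymmetric min/max scheme in PyTorch. The function names are hypothetical, the values are stored in uint8 rather than bit-packed, and the repository's actual GPTQ-style algorithm (error-compensated rounding) would differ:

```python
import torch

def quantize_6bit_groupwise(weight: torch.Tensor, group_size: int = 128):
    """Quantize a 2-D weight matrix to 6-bit integers, one scale/zero-point per group."""
    bits = 6
    qmax = 2 ** bits - 1  # 63
    out_features, in_features = weight.shape
    assert in_features % group_size == 0, "in_features must be divisible by group_size"

    # Reshape so the last dimension holds one group of `group_size` consecutive weights.
    w = weight.reshape(out_features, in_features // group_size, group_size)

    # Per-group min/max define the quantization range (asymmetric scheme).
    w_min = w.amin(dim=-1, keepdim=True)
    w_max = w.amax(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-8) / qmax
    zero_point = torch.round(-w_min / scale)

    # Round to the nearest 6-bit level and clamp to [0, 63].
    q = torch.clamp(torch.round(w / scale + zero_point), 0, qmax).to(torch.uint8)
    return q, scale, zero_point

def dequantize_6bit_groupwise(q, scale, zero_point, original_shape):
    """Reconstruct an approximate floating-point weight from the 6-bit representation."""
    w = (q.float() - zero_point) * scale
    return w.reshape(original_shape)

# Round-trip a random layer-sized matrix and measure the reconstruction error.
w = torch.randn(4096, 4096)
q, scale, zp = quantize_6bit_groupwise(w, group_size=128)
w_hat = dequantize_6bit_groupwise(q, scale, zp, w.shape)
print("mean abs error:", (w - w_hat).abs().mean().item())
```

With 6 bits there are 64 quantization levels per group instead of 16 at 4-bit, which is why the quality loss on smaller models should shrink noticeably while still saving memory relative to fp16.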
Ph0rk0z commented
Yes, 6-bit would work great for 13B and below to make the models smarter.