Does this include the GPTQ quantization tricks?
vedantroy opened this issue · 0 comments
The GPTQ readme has the following:
> which demonstrates two new tricks: `--act-order` (quantizing columns in order of decreasing activation size) and `--true-sequential` (performing sequential quantization even within a single Transformer block). Those fix GPTQ's strangely bad performance on the 7B model (from 7.15 to 6.09 Wiki2 PPL) and lead to slight improvements on most models/settings in general.
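For context, the `--act-order` trick amounts to choosing a column permutation before quantization. A minimal sketch of that ordering step (names and values here are hypothetical, not from the GPTQ codebase; the diagonal of the Hessian H = XX^T serves as the per-column activation size):

```python
import numpy as np

def act_order_permutation(hessian_diag):
    # Quantize columns with the largest activation second moments first,
    # so their rounding error can be compensated by the remaining,
    # less important columns later in the sequential GPTQ pass.
    return np.argsort(-np.asarray(hessian_diag))

# Illustrative values (hypothetical):
hessian_diag = [0.1, 3.0, 0.5, 2.0]
perm = act_order_permutation(hessian_diag)
# perm lists column indices in order of decreasing activation size: [1, 3, 2, 0]
```

The weight matrix's columns would then be permuted by `perm` before the usual sequential quantization loop, and the inverse permutation applied afterward.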
Does this repository use these tricks?