microsoft/BitBLAS

Issue with integrating with AutoGPTQ

Closed this issue · 2 comments

I tried AutoGPTQ with BitBLAS to quantize Llama 3 and keep hitting RuntimeError: shape '[2048, 2048]' is invalid for input of size 2097152

The error log:

...
    return self.forward(*args, **kwds)
  File "/root/anaconda3/envs/bitblas/lib/python3.10/site-packages/bitblas/ops/operator.py", line 373, in forward
    inputs = [op.forward(*inputs)]
  File "/root/anaconda3/envs/bitblas/lib/python3.10/site-packages/bitblas/ops/quant_compress/__init__.py", line 52, in forward
    args = [inp.view((self.M, self.N)), out]
RuntimeError: shape '[2048, 2048]' is invalid for input of size 2097152
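For context, `torch.Tensor.view` requires the target shape to have exactly as many elements as the input, and here it doesn't: a `(2048, 2048)` view needs 4,194,304 elements, but the buffer holds only 2,097,152 (half as many), which suggests the tensor being viewed is already packed/compressed. A pure-Python sketch of the element-count check (the numbers are taken from the error message; the check itself mirrors what `view` enforces, not BitBLAS internals):

```python
# Size reported in the error message: 2_097_152 elements.
numel = 2097152

# Shape requested by inp.view((self.M, self.N)) in the traceback.
M, N = 2048, 2048
needed = M * N  # 4_194_304 elements required for this view

# view() only succeeds when the element counts match; here they don't,
# and the buffer is exactly half the required size -- consistent with
# the input already being bit-packed before the view is attempted.
print(needed == numel)        # False -> torch raises RuntimeError
print(numel * 2 == needed)    # True  -> buffer is half the expected size
```

This doesn't pinpoint the root cause, but the factor-of-two gap is a useful clue when checking the configured bit width or packing format.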

Hi, we've integrated BitBLAS with GPTQModel (the maintained successor to AutoGPTQ); did you check out that repo?

Closed, and we recommend using the vLLM project to evaluate the GPTQ model format via BitBLAS :)