Issue with integrating with AutoGPTQ
Closed this issue · 2 comments
Steindox commented
I tried AutoGPTQ with BitBLAS to quantize Llama3, and I always encounter the problem RuntimeError: shape '[2048, 2048]' is invalid for input of size 2097152
The error log:
...
return self.forward(*args, **kwds)
File "/root/anaconda3/envs/bitblas/lib/python3.10/site-packages/bitblas/ops/operator.py", line 373, in forward
inputs = [op.forward(*inputs)]
File "/root/anaconda3/envs/bitblas/lib/python3.10/site-packages/bitblas/ops/quant_compress/__init__.py", line 52, in forward
args = [inp.view((self.M, self.N)), out]
RuntimeError: shape '[2048, 2048]' is invalid for input of size 2097152
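The failing call in the traceback is `inp.view((self.M, self.N))`. A minimal sketch of the mismatch, assuming M = N = 2048 as in the error: the input tensor holds 2097152 elements (2048 × 1024), but a 2048 × 2048 view needs 4194304, so `view` raises exactly this RuntimeError. The tensor sizes here are taken from the error message; the surrounding setup is hypothetical.

```python
import torch

# The quant_compress op reshapes with inp.view((M, N)), M = N = 2048.
M, N = 2048, 2048

# Hypothetical input with the element count from the error message:
# 2097152 = 2048 * 1024, i.e. half of the 2048 * 2048 = 4194304
# elements a (2048, 2048) view requires.
inp = torch.empty(2097152)

try:
    inp.view((M, N))
except RuntimeError as e:
    print(e)  # shape '[2048, 2048]' is invalid for input of size 2097152
```

This suggests the tensor reaching the compress op has already been packed or quantized to a different element count than the expected (M, N) layout.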
LeiWang1999 commented
Hi, we've integrated BitBLAS with GPTQModel (which is the latest version of GPTQ); did you check out that repo?
LeiWang1999 commented
Closed. We recommend using the vLLM project to evaluate the GPTQ model format via BitBLAS :)