qwopqwop200/GPTQ-for-LLaMa

Errors to compile with CUDA 12.1

Closed this issue · 2 comments

This might or might not be an issue.

I had a problem trying to install this with CUDA 12 using the instruction

TORCH_CUDA_ARCH_LIST="7.5" python setup_cuda.py install

and I got the error described in #68. However, before I came across that issue, I had tried to install PyTorch with CUDA 12.1 support using

pip install --require-virtualenv --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121

I then tried to install with setup_cuda.py, but I ran into a lot of errors.

Granted, the torch I have is a nightly build; there is still no stable torch release with CUDA 12 support, so it might be a problem with torch not being ready. However, it could also be something that needs to be addressed here in order to support CUDA 12.

It is possible to get it working with 12.1; unfortunately it's been a while, so I don't remember all the details.

I did have to edit a system wide header file for a pybind11 error.

But 'a lot of errors.' isn't really descriptive :/

Thanks a lot, that helped.

Just as a minor comment: if someone does not want to tamper with system files, I am using a virtual environment, so the file I needed to change is venv/lib/python3.11/site-packages/torch/include/pybind11/cast.h, applying the change:

-    return caster.operator typename make_caster<T>::template cast_op_type<T>();
+    return caster;
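
For anyone who prefers to script the edit rather than open the header by hand, here is a minimal Python sketch. The `patch_cast_h` helper and the venv path are illustrative assumptions, not part of the repo; the exact site-packages path depends on your Python version and environment layout.

```python
# Hedged sketch: apply the one-line pybind11 workaround inside a virtualenv,
# leaving system-wide headers untouched. Adjust the path for your setup.
from pathlib import Path

OLD = "return caster.operator typename make_caster<T>::template cast_op_type<T>();"
NEW = "return caster;"

def patch_cast_h(header: Path) -> bool:
    """Replace the failing cast_op_type line in cast.h.

    Returns True if a change was made, False if the line was not found
    (already patched, or a different pybind11 version).
    """
    text = header.read_text()
    if OLD not in text:
        return False
    header.write_text(text.replace(OLD, NEW))
    return True

if __name__ == "__main__":
    # Assumed venv layout for Python 3.11; change to match your environment.
    target = Path("venv/lib/python3.11/site-packages/torch/include/pybind11/cast.h")
    if target.exists():
        print("patched" if patch_cast_h(target) else "no change needed")
```

As with any header edit, this is a workaround rather than a fix: reinstalling or upgrading torch will restore the original file, so you may need to reapply it.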

At the end of the day I was correct: it is not a GPTQ-for-LLaMa issue, it is a CUDA issue.