AlexBuz/llama-zip

How to make n_gpu_layers work with CUDA?

sprappcom opened this issue · 1 comment

I used n_gpu_layers=99, but the GPU is not being used.

Have a look at the Colab notebook linked in the README. You'll need to install llama-cpp-python with CUDA support, which effectively means running the following command after installing llama-zip:

CMAKE_ARGS="-DLLAMA_CUDA=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --no-cache-dir

Note that this may take a while (e.g., it takes around 15 minutes on a Colab T4 instance).
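Once the reinstall finishes, a quick way to confirm that the CUDA build actually took is to load a model through llama-cpp-python with verbose logging and watch for GPU offload messages. Here's a minimal sketch (the model path is a placeholder):

```python
# Minimal sketch to verify the CUDA build, assuming llama-cpp-python is
# installed and a GGUF model is on disk (the path below is a placeholder).
from llama_cpp import Llama

# n_gpu_layers=-1 asks llama.cpp to offload all layers to the GPU.
# With a CUDA build, the verbose startup logs should mention your CUDA
# device and report layers being offloaded; a CPU-only build will not.
llm = Llama(
    model_path="/path/to/model.gguf",  # placeholder path
    n_gpu_layers=-1,
    verbose=True,
)
```

If the logs report zero layers offloaded, the wheel was most likely built without CUDA, and the force-reinstall command above needs to be rerun.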