How to use the GPU to run models, e.g. by setting "-ngl N, --n-gpu-layers N" as in the llama.cpp project?

Question

vo1d07 opened this issue 2 years ago · 2 comments

Here's what I've tried in my project:

ModelParameters modelParams = new ModelParameters.Builder()
        .setNGpuLayers(nGpuLayers)
        .build();

'nGpuLayers' is an integer with the same value I used in the llama.cpp project. However, Task Manager shows that the GPU does not seem to be used at all while the model is running. May I ask why? Thank you!

Answer 1 · 2023-12-08T10:05:56.000Z

Hi, you are probably using the pre-compiled llama.cpp library of this repository. We currently only provide support for CPU inference since there are too many ways to compile the library. For GPU support, you have to compile the library yourself. Please refer to https://github.com/kherud/java-llama.cpp#setup-required
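
As a rough illustration of the last step, here is a minimal sketch of checking that a self-compiled library is present before pointing the JVM at it. It rests on assumptions that should be verified against the setup section linked above: that the native library's base name is "jllama", and that the binding reads the system property "de.kherud.llama.lib.path" to locate it. The class and method names (NativeLibCheck, hasLibrary) are hypothetical helpers, not part of the library.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class NativeLibCheck {

    // Returns true if `dir` contains the platform-specific file for `baseName`,
    // e.g. libjllama.so on Linux or jllama.dll on Windows.
    static boolean hasLibrary(Path dir, String baseName) {
        return Files.isRegularFile(dir.resolve(System.mapLibraryName(baseName)));
    }

    public static void main(String[] args) {
        // Directory holding the library you compiled with GPU support.
        Path libDir = Paths.get(args.length > 0 ? args[0] : ".").toAbsolutePath();
        String baseName = "jllama"; // assumption: check the name your build produces

        if (hasLibrary(libDir, baseName)) {
            // Assumption: the binding looks up this property when loading the
            // native library. It must be set before any model class is used.
            System.setProperty("de.kherud.llama.lib.path", libDir.toString());
            System.out.println("Using native library from " + libDir);
        } else {
            System.out.println("No " + System.mapLibraryName(baseName)
                    + " found in " + libDir + "; the bundled CPU-only build may be used instead.");
        }
    }
}
```

If the check passes and GPU usage still stays at zero, the library was probably built without a GPU backend enabled, so the build flags are the next thing to look at.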

Answer 2 · 2023-12-09T03:57:48.000Z

Thank you so much!