skeskinen/bert.cpp

Does this support CUDA?

SpaceCowboy850 opened this issue · 1 comment

I can see where to set GGML_USE_CUBLAS, and I can follow the few #defines that activate the code, but the tensors all stay on the CPU. I don't see where bert.cpp would transfer the model or the inputs to the GPU.

Is this just not functioning yet?

I haven't done anything towards CUDA support. How easy or difficult it would be to implement depends on ggml, I guess.
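
For anyone reading along, here is a minimal sketch of how a compile-time switch such as GGML_USE_CUBLAS usually works in C++: the define only selects which code gets compiled, and it does not by itself move tensors to the GPU. The helper names `compiled_matmul_path` and `built_with_cublas` are hypothetical and exist only for this illustration; they are not part of ggml or bert.cpp, and the sketch makes no claim about what the ggml code behind the define actually does.

```cpp
#include <string>

// Hypothetical helper for illustration only; not part of ggml or bert.cpp.
// Reports which code path the preprocessor selected when this file was compiled.
inline std::string compiled_matmul_path() {
#ifdef GGML_USE_CUBLAS
    // Compiled only when the build passes -DGGML_USE_CUBLAS.
    return "cublas";
#else
    // Default path when the define is absent.
    return "cpu";
#endif
}

// Hypothetical helper: true when the file was built with -DGGML_USE_CUBLAS.
inline bool built_with_cublas() {
    return compiled_matmul_path() == "cublas";
}
```

Building the same file with and without `-DGGML_USE_CUBLAS` changes what `compiled_matmul_path()` returns. Any host-to-device transfer would still have to be written explicitly inside the guarded branch.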