/llama-cpp-gpu

Load larger models by splitting the model's layers between the GPU and the CPU

Primary language: Jupyter Notebook. License: MIT.

LLaMA.cpp GPU

Offloads some of the model layers to the GPU and keeps the rest on the CPU, so models that do not fit entirely in GPU memory can still be loaded.
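As a rough illustration of the idea, the sketch below estimates how many layers fit in a given amount of GPU memory and passes that count to the loader. It assumes the `llama-cpp-python` bindings and their `n_gpu_layers` argument, which this page does not confirm the notebook uses. The helper names `layers_that_fit` and `load_model`, the equal-size-layers simplification, and the numbers in the usage note are all hypothetical.

```python
def layers_that_fit(n_layers: int, model_bytes: int, vram_bytes: int,
                    reserve_bytes: int = 0) -> int:
    """Estimate how many layers can be offloaded to the GPU.

    Assumption: every layer is the same size, so one layer costs
    model_bytes / n_layers. Real models only approximate this.
    reserve_bytes is VRAM held back for the KV cache and scratch buffers.
    """
    if n_layers <= 0 or model_bytes <= 0:
        raise ValueError("n_layers and model_bytes must be positive")
    per_layer = model_bytes / n_layers
    usable = max(vram_bytes - reserve_bytes, 0)
    # Never report more layers than the model actually has.
    return min(n_layers, int(usable // per_layer))


def load_model(model_path: str, n_gpu_layers: int):
    """Load a model with the first n_gpu_layers layers on the GPU.

    Assumption: llama-cpp-python is installed with GPU support. The import
    is kept inside the function so the estimate above works without it.
    """
    from llama_cpp import Llama
    return Llama(model_path=model_path, n_gpu_layers=n_gpu_layers)
```

For example, with a hypothetical 40-layer model of 8 GB, 5 GB of VRAM, and 1 GB held in reserve, `layers_that_fit(40, 8_000_000_000, 5_000_000_000, 1_000_000_000)` suggests offloading 20 layers; the remaining 20 stay on the CPU.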

model colab resources parameters