/llama-cpp-gpu

Load larger models by offloading some model layers to the GPU and keeping the rest on the CPU
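As a rough illustration of the idea, the sketch below estimates how many layers fit in a given GPU memory budget and passes that count to a loader. It assumes the `llama-cpp-python` bindings and their `n_gpu_layers` parameter; this page does not say which bindings the notebooks use. The helper names `layers_that_fit` and `load_split_model` are hypothetical, and the estimate treats every layer as the same size while ignoring the KV cache and scratch buffers, so the result is an upper bound at best.

```python
def layers_that_fit(n_layers: int, layer_bytes: int, vram_budget_bytes: int) -> int:
    """Estimate how many layers fit in the VRAM budget, capped at n_layers.

    Assumption: all layers are the same size (layer_bytes). Real models
    also need GPU memory for the KV cache and scratch buffers.
    """
    if layer_bytes <= 0:
        raise ValueError("layer_bytes must be positive")
    return max(0, min(n_layers, vram_budget_bytes // layer_bytes))


def load_split_model(model_path: str, n_gpu_layers: int):
    """Load a model with n_gpu_layers on the GPU and the rest on the CPU.

    Assumption: llama-cpp-python is installed with GPU support.
    The import is lazy so the estimator above works without it.
    """
    from llama_cpp import Llama

    return Llama(model_path=model_path, n_gpu_layers=n_gpu_layers)
```

With a budget of zero the estimate is zero layers, which corresponds to running entirely on the CPU; a budget larger than the whole model caps at the total layer count.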

Primary language: Jupyter Notebook. License: MIT.
