llama-cpp-fastapi-server

A simple implementation for running the llama.cpp Python wrapper (llama-cpp-python) on a FastAPI server for asynchronous local inference.
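A minimal sketch of what such a server might look like, assuming llama-cpp-python and FastAPI are installed. The model path, endpoint name, and request fields below are illustrative assumptions, not the repo's actual API. Because llama.cpp inference is synchronous and compute-bound, the blocking call is pushed to a worker thread so the event loop stays responsive, and a lock serializes access to the single model instance.

```python
# Hypothetical server sketch; model path and route are placeholders.
import asyncio

from fastapi import FastAPI
from pydantic import BaseModel
from llama_cpp import Llama

app = FastAPI()

# Load the GGUF model once at startup (placeholder path).
llm = Llama(model_path="./models/model.gguf")

# Serialize access to the single Llama instance.
lock = asyncio.Lock()

class CompletionRequest(BaseModel):
    prompt: str
    max_tokens: int = 128

@app.post("/completion")
async def completion(req: CompletionRequest):
    async with lock:
        # Run the blocking inference call in a worker thread so the
        # asyncio event loop can keep handling other requests.
        result = await asyncio.to_thread(
            llm.create_completion, req.prompt, max_tokens=req.max_tokens
        )
    return {"text": result["choices"][0]["text"]}
```

Run with `uvicorn main:app` and query it with, e.g., `curl -X POST localhost:8000/completion -H 'Content-Type: application/json' -d '{"prompt": "Hello"}'`.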

Primary language: Python
