krishgoel/llama-cpp-fastapi-server
A simple implementation running the llama.cpp Python wrapper (llama-cpp-python) on a FastAPI server instance for asynchronous local inference.
Python