himanshu-ntml/nitro

An inference server on top of llama.cpp. OpenAI-compatible API, queue, & scaling. Embed a prod-ready, local inference engine in your apps. Powers Jan.

C++ · AGPL-3.0

Stargazers

No one has starred this repository yet.