/nitro

An inference server built on top of llama.cpp, with an OpenAI-compatible API, request queue, and scaling. Embed a production-ready, local inference engine in your apps. Powers Jan.

Primary language: C++
License: GNU Affero General Public License v3.0 (AGPL-3.0)

Stargazers

No one has starred this repository yet.