A simple, no-nonsense, local, CLI-based LLM inference tool, with LanceDB integration for RAG support and function calling.
Although nearly every library providing LLM inference capabilities exposes a CLI interface (llama-cpp, rustformers/llm, etc.), basic RAG + TTS support is missing from them. VocLLM does not aim to be the most flexible or configurable solution in this regard; it is instead a quick-and-dirty tool for testing out RAG systems and voice-interfaced LLMs.
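To make the RAG idea concrete, here is a minimal, dependency-free sketch of the retrieval step: embed the query, rank stored chunks by cosine similarity, and prepend the best matches to the prompt. In VocLLM the nearest-neighbour search is delegated to LanceDB; the `embed` function and the tiny 3-dimensional vectors below are toy stand-ins, not the real embedding model or store.

```rust
// Toy RAG retrieval sketch. `embed` is a hypothetical stand-in for a
// real embedding model; LanceDB would replace the linear scan below.

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Deterministic toy "embedding": character-class counts.
fn embed(text: &str) -> Vec<f32> {
    let mut v = vec![0.0f32; 3];
    for c in text.chars() {
        match c {
            'a'..='m' => v[0] += 1.0,
            'n'..='z' => v[1] += 1.0,
            _ => v[2] += 1.0,
        }
    }
    v
}

/// Return the `k` chunks most similar to the query.
fn retrieve<'a>(query: &str, chunks: &'a [(&'a str, Vec<f32>)], k: usize) -> Vec<&'a str> {
    let q = embed(query);
    let mut scored: Vec<(f32, &str)> = chunks
        .iter()
        .map(|(text, vec)| (cosine(&q, vec), *text))
        .collect();
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    scored.into_iter().take(k).map(|(_, t)| t).collect()
}

fn main() {
    let corpus = ["rust is fast", "llamas hum", "tts speaks text"];
    let chunks: Vec<(&str, Vec<f32>)> = corpus.iter().map(|t| (*t, embed(t))).collect();
    let context = retrieve("how fast is rust?", &chunks, 2);
    // Retrieved chunks become context for the actual LLM prompt.
    let prompt = format!("Context:\n{}\n\nQuestion: how fast is rust?", context.join("\n"));
    println!("{prompt}");
}
```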
A Python implementation leveraging llama-cpp and LanceDB bindings would be trivial, and suitably satisfactory.
However, I chose Rust, because everything is better in Rust. (Trust me)
- Mistral Support
- Chat Template
- Local Native TTS
- RWKV Support
- LanceDB RAG
- Function Calling
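As a sketch of what the function-calling item amounts to, the host can prompt the model to emit calls in a fixed line format, then parse and dispatch them. The `CALL name(arg)` convention and the two tools below are hypothetical illustrations, not VocLLM's actual protocol:

```rust
// Toy function-calling dispatch. The "CALL name(arg)" line format and
// the tools ("echo", "word_count") are hypothetical stand-ins.

fn dispatch(model_output: &str) -> Option<String> {
    let call = model_output.strip_prefix("CALL ")?;
    let (name, rest) = call.split_once('(')?;
    let arg = rest.strip_suffix(')')?;
    match name.trim() {
        "echo" => Some(arg.to_string()),
        "word_count" => Some(arg.split_whitespace().count().to_string()),
        _ => None, // unknown tool: fall back to plain generation
    }
}

fn main() {
    // Pretend the model produced this line during generation.
    let output = "CALL word_count(the quick brown fox)";
    match dispatch(output) {
        Some(result) => println!("tool result: {result}"),
        None => println!("no tool call detected"),
    }
}
```

In a real loop the tool result would be fed back into the model's context so generation can continue with it.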