This is a straightforward, zero-dependency CLI tool to interact with any LLM service.
It is available in several flavors:
- Python version. Compatible with CPython or PyPy, v3.10 or higher.
- JavaScript version. Compatible with Node.js (>= v18) or Bun (>= v1.0).
- Clojure version. Compatible with Babashka (>= 1.3).
- Go version. Compatible with Go, v1.19 or higher.
Once a suitable inference engine is set up (local or remote; see the next section), interact with the LLM:
./ask-llm.py # for Python users
./ask-llm.js # for Node.js users
./ask-llm.clj # for Clojure users
go run ask-llm.go # for Go users
or pipe the question directly to get an immediate answer:
echo "Why is the sky blue?" | ./ask-llm.py
or request the LLM to perform a certain task:
echo "Translate into German: thank you" | ./ask-llm.py
To use it locally with the llama.cpp inference engine, make sure to load a quantized model (e.g. TinyLlama, Gemma 2B, OpenHermes 2.5) with a suitable chat template, then set the environment variable LLM_API_BASE_URL accordingly:
~/llama.cpp/server -m gemma-2b-it-q4_k_m.gguf --chat-template gemma
export LLM_API_BASE_URL=http://127.0.0.1:8080/v1
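Before running the tool, a quick check against the server's OpenAI-compatible chat endpoint (assuming the default port shown above) should return a JSON completion:
curl -s http://127.0.0.1:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"messages": [{"role": "user", "content": "Hello"}]}'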
To use it locally with Nitro, follow its Quickstart guide to load a model (e.g. TinyLlama, OpenHermes 2.5) and set the environment variable LLM_API_BASE_URL:
export LLM_API_BASE_URL=http://localhost:3928/v1
To use it locally with Ollama, load a model and set the environment variable LLM_API_BASE_URL:
ollama pull gemma:2b
export LLM_API_BASE_URL=http://127.0.0.1:11434/v1
export LLM_CHAT_MODEL='gemma:2b'
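Any other model from the Ollama library can be used the same way; for illustration (the model name below is only an example), pull it and point LLM_CHAT_MODEL at it:
ollama pull mistral
export LLM_CHAT_MODEL='mistral'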
To use it locally with LocalAI, launch its container and set the environment variable LLM_API_BASE_URL:
docker run -ti -p 8080:8080 localai/localai tinyllama-chat
export LLM_API_BASE_URL=http://localhost:8080/v1
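Since LocalAI exposes an OpenAI-compatible API, listing the available models (assuming the port mapping above) is a quick way to confirm the container is ready:
curl http://localhost:8080/v1/models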
To use an OpenAI GPT model, set the environment variable OPENAI_API_KEY to your API key:
export OPENAI_API_KEY="sk-yourownapikey"
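If the tool also honors LLM_CHAT_MODEL when talking to OpenAI (an assumption, not confirmed above), a specific model can be requested the same way, for example:
export LLM_CHAT_MODEL='gpt-4o-mini'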
To use it with other LLM services, populate relevant environment variables as shown in these examples:
- Deep Infra:
export LLM_API_BASE_URL=https://api.deepinfra.com/v1/openai
export LLM_API_KEY="yourownapikey"
export LLM_CHAT_MODEL="mistralai/Mistral-7B-Instruct-v0.1"

- Fireworks AI:
export LLM_API_BASE_URL=https://api.fireworks.ai/inference/v1
export LLM_API_KEY="yourownapikey"
export LLM_CHAT_MODEL="accounts/fireworks/models/mistral-7b-instruct-4k"

- Groq:
export LLM_API_BASE_URL=https://api.groq.com/openai/v1
export LLM_API_KEY="yourownapikey"
export LLM_CHAT_MODEL="gemma-7b-it"

- Lepton AI:
export LLM_API_BASE_URL=https://mixtral-8x7b.lepton.run/api/v1/
export LLM_API_KEY="yourownapikey"

- OpenRouter:
export LLM_API_BASE_URL=https://openrouter.ai/api/v1
export LLM_API_KEY="sk-yourownapikey"
export LLM_CHAT_MODEL="mistralai/mistral-7b-instruct"

- Together AI:
export LLM_API_BASE_URL=https://api.together.xyz/v1
export LLM_API_KEY="sk-yourownapikey"
export LLM_CHAT_MODEL="mistralai/Mistral-7B-Instruct-v0.2"