/ask-llm

Interact with an LLM service

Primary LanguageJavaScriptMIT LicenseMIT

Ask LLM

asciicast

This is a straightforward, zero-dependency CLI tool to interact with any LLM service.

It is available in several flavors:

  • Python version. Compatible with CPython or PyPy, v3.10 or higher.
  • JavaScript version. Compatible with Node.js (>= v18) or Bun (>= v1.0).
  • Clojure version. Compatible with Babashka (>= 1.3).
  • Go version. Compatible with Go, v1.19 or higher.

Once a suitable inference engine is set up (local or remote, read the next section), interact with the LLM:

./ask-llm.py         # for Python user
./ask-llm.js         # for Node.js user
./ask-llm.clj        # for Clojure user
go run ask-llm.go    # for Go user

or pipe the question directly to get an immediate answer:

echo "Why is the sky blue?" | ./ask-llm.py

or request the LLM to perform a certain task:

echo "Translate into German: thank you" | ./ask-llm.py

To use it locally with llama.cpp inference engine, make sure to load a quantized model (example: TinyLLama, Gemma 2B, OpenHermes 2.5, etc) with the suitable chat template. Set the environment variable LLM_API_BASE_URL accordingly:

~/llama.cpp/server -m gemma-2b-it-q4_k_m.gguf --chat-template gemma
export LLM_API_BASE_URL=http://127.0.0.1:8080/v1

To use it locally with Nitro, follow its Quickstart guide to load a model (e.g. TinyLLama, OpenHermes 2.5, etc) and set the environment variable LLM_API_BASE_URL:

export LLM_API_BASE_URL=http://localhost:3928/v1

To use it locally with Ollama, load a model and set the environment variable LLM_API_BASE_URL:

ollama pull gemma:2b
export LLM_API_BASE_URL=http://127.0.0.1:11434/v1
export LLM_CHAT_MODEL='gemma:2b'

To use it locally with LocalAI, launch its container and the set environment variable LLM_API_BASE_URL:

docker run -ti -p 8080:8080 localai/localai tinyllama-chat
export LLM_API_BASE_URL=http://localhost:3928/v1

To use OpenAI GPT model, set the environment variable OPENAI_API_KEY to your API key:

export OPENAI_API_KEY="sk-yourownapikey"

To use it with other LLM services, populate relevant environment variables as shown in these examples:

export LLM_API_BASE_URL=https://api.deepinfra.com/v1/openai
export LLM_API_KEY="yourownapikey"
export LLM_CHAT_MODEL="mistralai/Mistral-7B-Instruct-v0.1"
export LLM_API_BASE_URL=https://api.fireworks.ai/inference/v1
export LLM_API_KEY="yourownapikey"
export LLM_CHAT_MODEL="accounts/fireworks/models/mistral-7b-instruct-4k"
export LLM_API_BASE_URL=https://api.groq.com/openai/v1
export LLM_API_KEY="yourownapikey"
export LLM_CHAT_MODEL="gemma-7b-it"
export LLM_API_BASE_URL=https://mixtral-8x7b.lepton.run/api/v1/
export LLM_API_KEY="yourownapikey"
export LLM_API_BASE_URL=https://openrouter.ai/api/v1
export LLM_API_KEY="sk-yourownapikey"
export LLM_CHAT_MODEL="mistralai/mistral-7b-instruct"
export LLM_API_BASE_URL=https://api.together.xyz/v1
export LLM_API_KEY="sk-yourownapikey"
export LLM_CHAT_MODEL="mistralai/Mistral-7B-Instruct-v0.2"