
Primary language: Go. License: Apache-2.0.

KNN Router

A minimal server that generates a ranked list of targets for a query, based on the query's k-nearest semantic neighbors. Written in Go.

KNN-router can be used within a larger system to route natural-language queries to the right downstream system with minimal latency. Given a user query, KNN-router:

  1. Looks up the most semantically similar neighbors from a curated corpus of example utterances
  2. Computes a weighted average score for each target associated with the top-K example utterances, weighting by the distance between the query and each utterance. Targets can be information retrieval systems, agents, LoRA adapters, small or large language models, or others. The sky is the limit!
  3. Returns a ranked list of targets that are most suitable for satisfying the query
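
The three steps above can be sketched in Go as follows. This is only an illustration, not the project's actual implementation: the `Neighbor` and `RankedTarget` types and the `RankTargets` function are hypothetical names, and the sketch assumes each neighbor carries a similarity value (higher means closer) that is used directly as its weight. The README does not specify the exact weighting formula, so treat that choice as an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Neighbor is a hypothetical type: one of the top-K example utterances
// returned by the vector search, with its similarity to the query.
// Assumption: higher similarity means a closer neighbor.
type Neighbor struct {
	PointUID   string
	Similarity float64
}

// RankedTarget is a hypothetical type: a target and its aggregated score.
type RankedTarget struct {
	Target string
	Score  float64
}

// RankTargets computes, for each target, the similarity-weighted average of
// its per-point scores over the given neighbors, then sorts descending.
// scores maps point UID -> target -> score.
func RankTargets(neighbors []Neighbor, scores map[string]map[string]float64) []RankedTarget {
	weightedSum := map[string]float64{}
	weightTotal := map[string]float64{}
	for _, n := range neighbors {
		for target, s := range scores[n.PointUID] {
			weightedSum[target] += n.Similarity * s
			weightTotal[target] += n.Similarity
		}
	}

	ranked := make([]RankedTarget, 0, len(weightedSum))
	for target, sum := range weightedSum {
		if weightTotal[target] == 0 {
			continue // avoid dividing by zero when all weights are zero
		}
		ranked = append(ranked, RankedTarget{Target: target, Score: sum / weightTotal[target]})
	}

	// Highest score first; break ties by name so the order is deterministic.
	sort.Slice(ranked, func(i, j int) bool {
		if ranked[i].Score != ranked[j].Score {
			return ranked[i].Score > ranked[j].Score
		}
		return ranked[i].Target < ranked[j].Target
	})
	return ranked
}

func main() {
	neighbors := []Neighbor{
		{PointUID: "p1", Similarity: 0.9},
		{PointUID: "p2", Similarity: 0.6},
	}
	scores := map[string]map[string]float64{
		"p1": {"target-a": 1.0, "target-b": 0.5},
		"p2": {"target-a": 0.5, "target-b": 1.0},
	}
	for _, rt := range RankTargets(neighbors, scores) {
		fmt.Printf("%s %.2f\n", rt.Target, rt.Score)
	}
}
```

If the vector store reports distances rather than similarities, the weight would need to be derived from the distance first (for example by inverting it), which is a design choice this sketch leaves open.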

At Pulze.ai, we use KNN-router to select the best LLM for user requests. Try it out locally here.

Usage

Quickstart

See this example for getting started locally.

Generating deployment artifacts

Dependencies:

  • points.jsonl: JSONL-formatted file containing points and their respective categories and embeddings. Each line should contain the following fields: point_uid, category, and embedding.
  • targets.jsonl: JSONL-formatted file containing the targets and their respective scores for each point. Each line should contain the following fields: point_uid, target, and score.
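
To make the two input formats concrete, here is a small Go sketch that parses them. The field names come from the descriptions above; the Go types (`Point`, `TargetScore`), the `ParseJSONL` helper, and the value types (string UIDs, float embeddings and scores) are assumptions for illustration, not part of the project's code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Point mirrors one line of points.jsonl.
// Assumption: point_uid is a string and embedding is a list of floats.
type Point struct {
	PointUID  string    `json:"point_uid"`
	Category  string    `json:"category"`
	Embedding []float64 `json:"embedding"`
}

// TargetScore mirrors one line of targets.jsonl.
// Assumption: score is a floating-point number.
type TargetScore struct {
	PointUID string  `json:"point_uid"`
	Target   string  `json:"target"`
	Score    float64 `json:"score"`
}

// ParseJSONL decodes one JSON object of type T per non-empty line.
func ParseJSONL[T any](data []byte) ([]T, error) {
	var out []T
	for i, line := range bytes.Split(data, []byte("\n")) {
		line = bytes.TrimSpace(line)
		if len(line) == 0 {
			continue // tolerate blank lines and a trailing newline
		}
		var v T
		if err := json.Unmarshal(line, &v); err != nil {
			return nil, fmt.Errorf("line %d: %w", i+1, err)
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	// Made-up sample data, shaped like the two input files.
	points := []byte(`{"point_uid": "p1", "category": "coding", "embedding": [0.5, 0.25]}
{"point_uid": "p2", "category": "writing", "embedding": [0.75, 1.0]}
`)
	targets := []byte(`{"point_uid": "p1", "target": "target-a", "score": 1.0}
{"point_uid": "p1", "target": "target-b", "score": 0.5}
`)

	ps, err := ParseJSONL[Point](points)
	if err != nil {
		panic(err)
	}
	ts, err := ParseJSONL[TargetScore](targets)
	if err != nil {
		panic(err)
	}
	fmt.Printf("parsed %d points and %d target scores\n", len(ps), len(ts))
}
```

For real files, the bytes could come from `os.ReadFile`; very large embedding files may be better handled with a streaming reader, which this sketch skips for brevity.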

The following artifacts are required for deployment:

  • embeddings.snapshot: Snapshot of Qdrant collection containing the point embeddings
  • scores.db: Bolt DB containing the targets and their respective scores for each point

Use this script to generate these artifacts:

scripts/gen-artifacts.sh --points-data-path points.jsonl --scores-data-path targets.jsonl --output-dir ./dist

TODOs

  • Helm chart
  • gRPC endpoint