Search https://docs.voxel51.com with an LLM!
This repo contains the code to enable semantic search on the Voxel51 documentation from Python or the command line. The search is powered by FiftyOne, OpenAI's text-embedding-ada-002 model, and Qdrant vector search.
- Clone the fiftyone-docs-search repo:
git clone https://github.com/voxel51/fiftyone-docs-search
- Install the fiftyone-docs-search package by cd-ing into the repo and running:
pip install -e .
- Register for an OpenAI API key. Once you have your API key, set the OPENAI_API_KEY environment variable to it:
export OPENAI_API_KEY=<your key>
- Set up a Docker container with Qdrant running locally:
docker pull qdrant/qdrant
docker run -d -p 6333:6333 qdrant/qdrant
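Before running a first search, it can be useful to confirm that the two prerequisites above are in place. The following is a minimal sketch using only the Python standard library; the helper names openai_key_is_set and port_is_open are hypothetical and not part of the fiftyone-docs-search package. It assumes Qdrant is listening on localhost port 6333, as in the docker run command above, and it only checks that the port accepts TCP connections, not that Qdrant itself is healthy.

```python
import os
import socket


def openai_key_is_set(environ=os.environ):
    """Return True if OPENAI_API_KEY is present and non-empty."""
    return bool(environ.get("OPENAI_API_KEY", "").strip())


def port_is_open(host="localhost", port=6333, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("OPENAI_API_KEY set:", openai_key_is_set())
    print("Qdrant port 6333 reachable:", port_is_open())
```

If either check prints False, revisit the corresponding setup step before running a search.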
The fiftyone-docs-search package provides a command line interface for searching the Voxel51 documentation. To use it, run:
fiftyone-docs-search query <query>
where <query> is the search query. For example:
fiftyone-docs-search query "how to load a dataset"
The following flags can give you control over the search behavior:
- --num_results: the number of results returned
- --open_url: whether to open the top result in your browser
- --score: whether to return the score of each result
- --doc_types: the types of docs to search over (e.g., "tutorials", "api", "guides")
- --block_types: the types of blocks to search over (e.g., "code", "text")
You can also use the --help flag to see all available options:
fiftyone-docs-search --help
If you find fiftyone-docs-search query cumbersome, you can alias the command by adding the following to your ~/.bashrc or ~/.zshrc file:
alias fosearch='fiftyone-docs-search query'
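If you want to call the command line interface from a script, one option is to wrap it with subprocess. The sketch below is illustrative only: build_query_command and run_query are hypothetical helpers, and it assumes that --num_results takes an integer value and may follow the query string. Run fiftyone-docs-search --help to confirm the exact syntax.

```python
import shlex
import subprocess


def build_query_command(query, num_results=None):
    """Build the argument list for `fiftyone-docs-search query`."""
    cmd = ["fiftyone-docs-search", "query", query]
    if num_results is not None:
        # Assumption: --num_results takes an integer value.
        cmd += ["--num_results", str(num_results)]
    return cmd


def run_query(query, num_results=None):
    """Run the CLI and return its standard output as text."""
    completed = subprocess.run(
        build_query_command(query, num_results),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Show the command that would be run, quoted for a shell.
print(shlex.join(build_query_command("how to load a dataset", num_results=3)))
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems when the query contains spaces or special characters.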
The fiftyone-docs-search package also provides a Python API for searching the Voxel51 documentation. To use it, run:
from fiftyone.docs_search import FiftyOneDocsSearch
fods = FiftyOneDocsSearch()
results = fods("how to load a dataset")
You can set defaults for the search behavior by passing arguments to the constructor:
fods = FiftyOneDocsSearch(
num_results=5,
open_url=True,
score=True,
doc_types=["tutorials", "api", "guides"],
block_types=["code", "text"],
)
For any individual search, you can override these defaults by passing arguments.
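The pattern of constructor defaults with per-search overrides can be illustrated with a small stand-in. The class below is a hypothetical toy, not the package's FiftyOneDocsSearch implementation; it only shows how stored defaults and per-call arguments might be merged, assuming that an argument left as None falls back to the default.

```python
class SearchSettings:
    """Hypothetical stand-in showing constructor defaults with per-call overrides."""

    def __init__(self, **defaults):
        self.defaults = dict(defaults)

    def resolve(self, **overrides):
        """Merge per-call overrides on top of the stored defaults."""
        settings = dict(self.defaults)
        # Only explicitly supplied (non-None) values replace a default.
        settings.update({k: v for k, v in overrides.items() if v is not None})
        return settings


settings = SearchSettings(num_results=5, score=True)
print(settings.resolve(num_results=10))  # {'num_results': 10, 'score': True}
```

The stored defaults are copied on every call, so an override applies to one search only and does not change later searches.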
The fiftyone-docs-search package is versioned to match the version of the Voxel51 FiftyOne documentation that it is searching. For example, the v0.20.1 version of the fiftyone-docs-search package is designed to search the v0.20.1 version of the Voxel51 FiftyOne documentation.
By default, if you do not yet have a Qdrant collection when you run a search, the fiftyone-docs-search package will automatically download a JSON file containing a vector index of the latest version of the Voxel51 FiftyOne documentation.
If you would like, you can also build the index yourself from a local copy of the Voxel51 FiftyOne documentation. To do so, first clone the FiftyOne repo if you haven't already:
git clone https://github.com/voxel51/fiftyone
and install FiftyOne, as described in the FiftyOne installation instructions.
Build a local version of the docs by running:
bash docs/generate_docs.bash
Then, set a FIFTYONE_DIR environment variable to the path to the local FiftyOne repo. For example, if you cloned the repo to ~/fiftyone, you would run:
export FIFTYONE_DIR=~/fiftyone
Finally, run the following command to build the index:
fiftyone-docs-search create
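If the create command fails, a common cause is a FIFTYONE_DIR that is unset or does not point at the repo root. The sketch below is a hypothetical pre-flight check, not part of the package. It assumes only what the steps above state: that the repo contains docs/generate_docs.bash. It does not verify that the docs were actually built.

```python
import os
from pathlib import Path


def check_fiftyone_dir(environ=os.environ):
    """Return the expanded FIFTYONE_DIR path, or raise if it looks wrong."""
    raw = environ.get("FIFTYONE_DIR")
    if not raw:
        raise RuntimeError("FIFTYONE_DIR is not set")
    # Expand a leading ~ in case the value was set without shell expansion.
    repo = Path(raw).expanduser()
    script = repo / "docs" / "generate_docs.bash"
    if not script.is_file():
        raise RuntimeError(f"{script} not found; is FIFTYONE_DIR the repo root?")
    return repo
```

Calling check_fiftyone_dir() with no arguments reads the current environment; passing a dictionary instead makes the check easy to exercise without changing your shell.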
If you would like to save the Qdrant index to JSON, you can run:
fiftyone-docs-search save -o <path to JSON file>
We welcome contributions to this repo!
If you've made it this far, we'd greatly appreciate it if you'd take a moment to check out our main repo, FiftyOne, and give that project a star. Thanks so much!