General usage pattern of gpt-index
- Load in documents (either manually, or through a data loader).
- Construct the index.
- [Optional, advanced] Build indices on top of other indices.
- Query the index.
All scripts are split into two steps: one creates the index and the other runs a query against it. Index generation only needs to run once; the query can then be run multiple times. Be aware that index generation is still naive and overwrites the index file on every run. A more elaborate system would update the index only when necessary.
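One way to avoid rebuilding on every run is to store a small fingerprint next to the index and regenerate only when it changes. The following is a minimal sketch under stated assumptions, not what the scripts in this repository do: the function names and the `.stamp.json` file are hypothetical, and the fingerprint only covers the list of page IDs, so edits to page content would not be detected.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(page_ids):
    """Return a stable hash of the page IDs that go into the index."""
    joined = "\n".join(sorted(page_ids))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def needs_rebuild(index_path, page_ids):
    """True if the index is missing or was built from different page IDs."""
    index_path = Path(index_path)
    # Hypothetical stamp file, e.g. index.json -> index.stamp.json
    stamp_path = index_path.with_suffix(".stamp.json")
    if not index_path.exists() or not stamp_path.exists():
        return True
    stamp = json.loads(stamp_path.read_text())
    return stamp.get("fingerprint") != fingerprint(page_ids)


def record_build(index_path, page_ids):
    """Store the fingerprint next to the index after a successful build."""
    stamp_path = Path(index_path).with_suffix(".stamp.json")
    stamp_path.write_text(json.dumps({"fingerprint": fingerprint(page_ids)}))
```

A generation script could call `needs_rebuild(...)` before doing any work, and `record_build(...)` right after writing the index to disk.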
- Install poetry.
- Install direnv (brew install direnv) and set it up to work with your shell.
- Copy .envrc.example to .envrc and set the environment variables.
- Get an OpenAI API key and set it as the environment variable OPENAI_API_KEY in .envrc.
- Create a new Notion integration.
- Get a Notion API key and set it as the environment variable NOTION_INTEGRATION_TOKEN in .envrc.
- In the upper right corner of a Notion page, click the three dots and add a connection to your integration. Every subpage will also be available to the integration.
- Get the Notion page ID and add it to the page_ids list in the script notion/generate-index-from-pages.py.
- Always ignore the generated indices in your git repository, e.g.
echo "notion/<YOUR INDEX NAME>.json" >> .gitignore
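The page ID can usually be read off the page URL, where it appears as the trailing 32 hexadecimal characters. A small helper like the one below could extract it. This is a sketch, not part of the repository: `page_id_from_url` is a hypothetical name, and it assumes the URL ends with the ID, optionally followed by a query string or fragment.

```python
import re
import uuid

# Assumption: the URL ends in a 32-character hexadecimal ID, optionally
# followed by a query string or fragment.
_ID_PATTERN = re.compile(r"([0-9a-fA-F]{32})(?:[?#].*)?$")


def page_id_from_url(url):
    """Extract the page ID from a Notion URL as a dashed, lowercase UUID."""
    match = _ID_PATTERN.search(url)
    if match is None:
        raise ValueError(f"no page ID found in {url!r}")
    # uuid.UUID normalizes the 32 hex characters into 8-4-4-4-12 form.
    return str(uuid.UUID(hex=match.group(1)))
```

The returned strings can then be pasted into the page_ids list.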
cd repo
# install dependencies
poetry install
# activate the virtual environment
poetry shell
# allow direnv to set environment variables in the shell
direnv allow
# write the index to disk
# choose one of the two scripts:
python notion/generate-index-from-pages.py
# or: python notion/generate-index-from-database.py
# run the query
# The script will prompt you for input
python notion/query-index.py
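If a script fails with an authentication error, the first thing to check is whether direnv actually exported the two variables into the current shell. The check below is a hedged sketch that could sit at the top of a script; `missing_env_vars` is a hypothetical helper, not something the repository provides.

```python
import os

# The two variables named in the setup steps above.
REQUIRED_VARS = ("OPENAI_API_KEY", "NOTION_INTEGRATION_TOKEN")


def missing_env_vars(environ=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]
```

Calling `missing_env_vars()` with no arguments inspects the real environment; a non-empty result means .envrc is incomplete or `direnv allow` has not been run in this directory.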
In the folder notion/utils you can find a script that lists all the page or database IDs your integration has access to. This is useful if you want to generate an index for all the pages in your Notion workspace.
poetry shell
direnv allow
python notion/utils/list-ids.py
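For readers curious how such a listing can work, here is a rough sketch against Notion's public search endpoint. It is not the code in notion/utils/list-ids.py: the function names are hypothetical, the pinned API version is an assumption, and only the standard library is used.

```python
import json
import urllib.request

NOTION_SEARCH_URL = "https://api.notion.com/v1/search"
NOTION_VERSION = "2022-06-28"  # assumption: pinned API version for this sketch


def extract_ids(search_response):
    """Group the IDs in one search response by object type."""
    ids = {"page": [], "database": []}
    for result in search_response.get("results", []):
        kind = result.get("object")
        if kind in ids:
            ids[kind].append(result["id"])
    return ids


def search_all(token):
    """Yield every response from the search endpoint, following cursors."""
    cursor = None
    while True:
        # An empty body asks for everything the integration can access.
        body = {"start_cursor": cursor} if cursor else {}
        request = urllib.request.Request(
            NOTION_SEARCH_URL,
            data=json.dumps(body).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {token}",
                "Notion-Version": NOTION_VERSION,
                "Content-Type": "application/json",
            },
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
        yield payload
        if not payload.get("has_more"):
            break
        cursor = payload.get("next_cursor")
```

Iterating over `search_all(os.environ["NOTION_INTEGRATION_TOKEN"])` and passing each response to `extract_ids` would collect the IDs. Only pages and databases that have been connected to the integration show up in the results.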