Feature: Notion - Ingest & Vectorize
polux0 commented
Source: https://rndadocs.notion.site/Notion-Ingest-Vectorize-f41e7f74e91047e88f6902764511e273
Tasks:
- Create a DAG `hivemind_notion_etl.py`
  - name: `notion_vector_store_update`
  - interval: `0 4 * * *`
- Create a task `get_notion_communities`. This task should get all communities that have at least one `notion` platform associated.
- Create a task `start_notion_vector_store(community_id: str)`.
  - Connect to MongoDB and retrieve the Hivemind module settings for the given community.
  - For that module, get the platforms that are `notion`.
  - Given the platform settings, retrieve the pages using the [NotionPageReader](https://docs.llamaindex.ai/en/stable/examples/data_connectors/NotionDemo/) (LlamaIndex reader).
  - Define an `IngestionPipeline` to chunk, embed, and save the data in a PostgreSQL database.
- Create appropriate test cases
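
A minimal sketch of the selection logic that `get_notion_communities` could wrap, kept as pure functions so it can be unit tested without MongoDB or Airflow. The document shape (`community`, `options.platforms`, `name`) and the helper names `notion_platforms` and `select_notion_communities` are assumptions made for illustration, not the actual Hivemind module schema.

```python
from typing import Any, Dict, Iterable, List

# Assumed (hypothetical) shape of one module document:
# {"community": "c1", "options": {"platforms": [{"name": "notion", ...}]}}


def notion_platforms(module: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the platform entries of one module document named 'notion'."""
    platforms = module.get("options", {}).get("platforms", [])
    return [p for p in platforms if p.get("name") == "notion"]


def select_notion_communities(modules: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the IDs of communities that have at least one Notion platform.

    IDs are returned in first-seen order, without duplicates.
    """
    seen = set()
    community_ids: List[str] = []
    for module in modules:
        community_id = str(module["community"])
        if notion_platforms(module) and community_id not in seen:
            seen.add(community_id)
            community_ids.append(community_id)
    return community_ids


if __name__ == "__main__":
    sample_modules = [
        {"community": "c1", "options": {"platforms": [{"name": "notion"}]}},
        {"community": "c2", "options": {"platforms": [{"name": "discord"}]}},
    ]
    print(select_notion_communities(sample_modules))
```

In the DAG, the first task would feed the module documents fetched from MongoDB into `select_notion_communities`, and `start_notion_vector_store` could reuse `notion_platforms` to pick the settings it passes to the reader.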