AI-Powered Knowledge Base

YouTube Demo

This repository contains an AI-powered knowledge base that uses a large language model (LLM) to answer questions about a given website's content and cites its sources as links to the relevant pages.

The system:

  1. Loads the website's content using a sitemap.
  2. Splits each web page into chunks.
  3. Embeds each chunk using an LLM (for now, OpenAI) and stores the embeddings in the Chroma vector database.
  4. Embeds the user query and runs a similarity search against the Chroma database.
  5. Passes the similarity search results as context to an LLM (for now, ChatGPT), which finds relevant answers and cites the sources.
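
The chunk, embed, and search steps above can be illustrated with a small self-contained sketch. This is not the project's code: `toy_embed` is a bag-of-words stand-in for the OpenAI embeddings, `ToyVectorStore` is a hypothetical in-memory stand-in for Chroma, and the chunk sizes and example URLs are made up for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (sizes are illustrative)."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database such as Chroma."""

    def __init__(self):
        self.entries = []  # tuples of (embedding, chunk text, source URL)

    def add(self, text, source):
        # Chunk the page, embed each chunk, and remember where it came from.
        for chunk in split_into_chunks(text):
            self.entries.append((toy_embed(chunk), chunk, source))

    def search(self, query, k=2):
        # Embed the query and rank stored chunks by similarity to it.
        query_vec = toy_embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vec, entry[0]),
            reverse=True,
        )
        return [(chunk, source) for _, chunk, source in ranked[:k]]


store = ToyVectorStore()
store.add(
    "Deploy the app by pushing to the main branch. The build runs automatically.",
    "https://example.com/docs/deploy",
)
store.add(
    "Configure environment variables in the project settings page.",
    "https://example.com/docs/env",
)

top_hits = store.search("How do I deploy the app?", k=1)
print(top_hits)
```

Keeping the source URL next to each chunk is what lets the final answer link back to the pages it drew from.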

It also provides a Streamlit-based web interface for an easy-to-use experience.


  • The main module, which defines the KnowledgeBase class. This class is responsible for loading and processing the website content, creating the document index, and querying the LLM for answers.
  • A Streamlit web application that provides a user interface for querying the AI-powered knowledge base.


  1. Clone the repository:
git clone
  2. Install the project with Poetry:
poetry install


Knowledge Base

To use the KnowledgeBase class, follow these steps:

  1. Import the KnowledgeBase class:
from knowledge_base import KnowledgeBase
  2. Instantiate KnowledgeBase with the sitemap URL and an optional pattern:
kb = KnowledgeBase(
  3. Ask a question:
result = kb.ask("How do I deploy my Next.js app?")
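
To make the final step of the pipeline more concrete, here is a rough sketch of how retrieved chunks might be turned into a prompt that asks the LLM to cite its sources. It is an assumption about the general technique, not a description of what `kb.ask` does internally; `build_prompt`, the prompt wording, and the example URLs are all hypothetical.

```python
def build_prompt(question, hits):
    """Assemble a prompt from (chunk_text, source_url) pairs.

    Returns the prompt string and the de-duplicated list of source URLs,
    in the order the bracketed citation numbers refer to them.
    """
    sources = []
    context_lines = []
    for text, url in hits:
        if url not in sources:
            sources.append(url)
        # Number each chunk by the position of its source URL.
        context_lines.append(f"[{sources.index(url) + 1}] {text}")

    prompt = (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        "Context:\n" + "\n".join(context_lines) + "\n\n"
        f"Question: {question}\nAnswer:"
    )
    return prompt, sources


example_hits = [
    ("Deploy the app by pushing to the main branch.", "https://example.com/docs/deploy"),
    ("The build runs automatically after each push.", "https://example.com/docs/deploy"),
    ("Set environment variables before deploying.", "https://example.com/docs/env"),
]
prompt, sources = build_prompt("How do I deploy my app?", example_hits)
print(prompt)
print(sources)
```

The returned `sources` list can then be rendered as links alongside the model's answer.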

Web Application

To run the Streamlit web application, execute the following command in your terminal:

streamlit run

The web app will open in your default browser. Enter the URL to the website's sitemap, an optional filter pattern for the URLs, and your question. The AI-powered knowledge base will return an answer based on the content of the website.
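
As an illustration of how a sitemap and a filter pattern could work together, the sketch below extracts page URLs from sitemap XML and keeps only those that match a pattern. It assumes the pattern is a regular expression, which the project may or may not use; `urls_from_sitemap` and the sample sitemap are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(sitemap_xml, pattern=None):
    """Return the <loc> URLs in a sitemap, optionally filtered by a regex."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    if pattern:
        # Assumption: the filter is a regular expression matched anywhere in the URL.
        urls = [url for url in urls if re.search(pattern, url)]
    return urls


SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/deploy</loc></url>
  <url><loc>https://example.com/blog/news</loc></url>
  <url><loc>https://example.com/docs/env</loc></url>
</urlset>"""

doc_urls = urls_from_sitemap(SAMPLE_SITEMAP, pattern=r"/docs/")
print(doc_urls)
```

Filtering before loading keeps unrelated pages, such as blog posts, out of the index.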



This project is licensed under the MIT License.