
🤖 MULTI-MODAL RAG APPLICATION 🤖

Building Essence Towards a Personalized Knowledge Model (PKM)



1. Introduction

🌟 Welcome to the Personal Knowledge Model (PKM) Project! 🌟

🔍 Discover the Future of Personalized Knowledge!
Our project draws inspiration from Microsoft's recent Recall initiative but takes a unique, user-centric approach. PKM uses advanced, on-device knowledge graph creation and updates, tailored weekly or monthly based on your interests. This ensures you can retrieve anything from your browsing history, search items, and mobile interactions, all at your fingertips!

🛡️ Privacy First!
Unlike Microsoft Recall, which may expose user data through LLM inference, our approach guarantees your privacy with on-device knowledge graph operations. Your data never leaves your device, ensuring a fortress of privacy around your personal information.

🔧 What We're Building:
Our repository is dedicated to crafting a multi-modal RAG (Retrieval-Augmented Generation) application over PDFs and YouTube videos. Think of this project as your gateway to creating a Personal Knowledge Model (PKM). While still in development, our PKM leverages your device's GPU to generate sophisticated knowledge graphs on the fly.

🚀 Current Focus:
For now, our eyes are set on multi-modal RAG applications using static data sources like PDFs and YouTube videos. The exciting part? Our RAG can retrieve relevant text, images, and video frames directly related to your queries!

👀 Future Vision:
Microsoft recently unveiled GraphRAG, which currently supports CSV and TXT formats. We're aligning with this technology but expanding its capabilities to embrace PDF and video data, making our tool incredibly versatile.

🌐 Open Source Collaboration:
This repository is open for contributions! We're inviting developers and enthusiasts to join us in achieving the PKM vision. Whether you're interested in pushing the boundaries of machine learning or simply passionate about privacy-centric technology, your input is invaluable.

💻 Static Multi-Modal RAG Application:
Imagine interacting with a system that understands and retrieves information across multiple media types without needing the computational powerhouses typically required for dynamic data processing. That's what we're aiming for: practical, accessible, and groundbreaking.


2. Basic Architectures for PKM, MMR-PDF (Multi-Modal RAG for PDF) & MMR-Video (Multi-Modal RAG for Video)

  • Basic PKM Architecture (see flowchart image in the repository)

  • MMR-PDF (Multi-Modal RAG for PDF) Architecture, Static (see flowchart image in the repository)

  • MMR-Video (Multi-Modal RAG for Video) Architecture, Static (see flowchart image in the repository)

3. Tech Stack

  • 🧠 MindsDB OpenAI-compatible endpoint (mdb.ai)
  • 🧩 LangChain
  • 🖥️ Streamlit
  • 🗂️ FAISS

One of the trickiest parts of the project is wiring LangChain to the mdb.ai endpoint: LangChain needs a custom LLM class and a custom embedding class, both built on its base classes. Minimal sketches of each are shown below.
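First, a sketch of the custom LLM wrapper. It assumes mdb.ai exposes an OpenAI-compatible chat route at https://llm.mdb.ai and accepts a model name such as gpt-3.5-turbo; both values are assumptions, so check the mdb.ai documentation for the ones your account uses.

from typing import Any, List, Optional

import requests
from langchain_core.language_models.llms import LLM

MDB_BASE_URL = "https://llm.mdb.ai"  # assumed OpenAI-compatible base URL


class MindsDBLLM(LLM):
    api_key: str
    model: str = "gpt-3.5-turbo"  # assumed model identifier

    @property
    def _llm_type(self) -> str:
        return "mindsdb-openai"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[Any] = None,
        **kwargs: Any,
    ) -> str:
        # Send the prompt as a single-turn, OpenAI-style chat completion.
        response = requests.post(
            f"{MDB_BASE_URL}/chat/completions",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={
                "model": self.model,
                "messages": [{"role": "user", "content": prompt}],
                "stop": stop,
            },
            timeout=60,
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]

And a matching sketch of the custom embedding class. It reuses MDB_BASE_URL and the requests import from the block above; the /embeddings route and the embedding model name are likewise assumptions.

from typing import List

from langchain_core.embeddings import Embeddings


class MindsDBEmbeddings(Embeddings):
    def __init__(self, api_key: str, model: str = "text-embedding-ada-002"):
        self.api_key = api_key
        self.model = model  # assumed embedding model identifier

    def _embed(self, texts: List[str]) -> List[List[float]]:
        response = requests.post(
            f"{MDB_BASE_URL}/embeddings",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"model": self.model, "input": texts},
            timeout=60,
        )
        response.raise_for_status()
        # The OpenAI-style response returns one vector per input text.
        return [item["embedding"] for item in response.json()["data"]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return self._embed(texts)

    def embed_query(self, text: str) -> List[float]:
        return self._embed([text])[0]

Both classes then plug into standard LangChain components, for example FAISS.from_documents(chunks, MindsDBEmbeddings(api_key=...)) or a chain built around MindsDBLLM.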

4. Steps to Run the Project

Step 1: Clone the Repository

First, clone the repository to your local machine:

git clone https://github.com/chakka-guna-sekhar-venkata-chennaiah/Mutli-Modal-RAG-ChaBot.git

Step 2: Navigate to the Repository

Change into the repository directory:

cd Mutli-Modal-RAG-ChaBot

Step 3: Install Required Libraries

Install all the necessary libraries:

python -m pip install -r requirements.txt

Step 4: Create the Secrets File

Before running the Streamlit application, create a folder named .streamlit. Inside this folder, create a file named secrets.toml and add the following code:

api_key='your-minds-api-key'
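Streamlit loads .streamlit/secrets.toml automatically at startup. Assuming app.py reads the key through st.secrets (Streamlit's standard mechanism), the lookup would be just:

import streamlit as st

# The key name must match the one used in secrets.toml ("api_key" here).
api_key = st.secrets["api_key"]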

Step 5: Initialize the User Interface

Launch the Streamlit application:

python -m streamlit run app.py

Important Note:

This repository is built around the Monuments of National Importance PDF and a YouTube video. If you wish to use your own resources, follow the steps in the available Colab notebooks and replace the FAISS index files accordingly. Feel free to customize app.py to suit your needs.
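For the PDF side, swapping in your own data mostly amounts to rebuilding the text index with the same embedding class. Here is a minimal sketch, assuming a hypothetical input my_monuments.pdf and an index folder named faiss_index; the Colab notebooks remain the authoritative steps, and the folder names app.py expects may differ.

from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load and chunk your own PDF ("my_monuments.pdf" is a placeholder name).
docs = PyPDFLoader("my_monuments.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# Embed with the custom class sketched earlier and save the index
# ("faiss_index" is an assumed folder name).
embeddings = MindsDBEmbeddings(api_key="your-minds-api-key")
index = FAISS.from_documents(chunks, embeddings)
index.save_local("faiss_index")

# app.py can later reload it with the same embedding class:
index = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)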

Our ultimate goal is to develop a Personalized Knowledge Model (PKM) and a dynamic multi-modal RAG system. This repository serves as a gateway to achieving that vision, and we welcome your support and contributions to help make it a reality.

5. Demo (Detailed Explanation)

For a quick demo, watch 👀

Screen.Recording.2024-07-09.at.4.mp4

Live WebApp Prototype

6. Future Enhancements

  • Dynamic Multi-Modal RAG Application
  • Integration with real-time on-device data for creating advanced knowledge graphs

⭐ If you find this repository useful, please star it! ⭐

Meet the Author

GitHub: chakka-guna-sekhar-venkata-chennaiah

LinkedIn: Chakka Guna Sekhar Venkata Chennaiah

X (formerly Twitter): @codevlogger

Thank you for being a part of our journey to create an advanced, personalized knowledge system! 🌟