/cml_chatbot

Git Repo for CML Chatbot

Primary LanguagePython

CML PDF Document Chatbot

Git Repo for CML Document Chatbot

Key Features

  • Load PDF Documents directly from the App UI
  • Chunking and Cleaning of Documents
  • Model of Choice from HuggingFace
  • Gradio Chatbot Interface
  • Standalone Chroma VectorDB

Prerequistes

Features Under Development

  1. Conversational Memory (In Progress)
  2. Support for other VectorDBs (Milvus)

Resource Requirements

The AMP Application has been configured to use the following

  • 4 CPU
  • 32 GB RAM
  • 1 GPU

Steps to Setup the CML AMP Application

  1. Navigate to CML Workspace -> Site Administration -> AMPs Tab

  2. Under AMP Catalog Sources section, We will "Add a new source By" selecting "Catalog File URL"

  3. Provide the following URL and click "Add Source"

https://raw.githubusercontent.com/nkityd09/cml_speech_to_text/main/catalog.yaml
  1. Once added, We will be able to see the LLM PDF Document Chatbot in the AMP section and deploy it from there.

  2. During AMP deployment, we can set three environment varibales

    • VectorDB_IP:- (Specify the Public IP address of host where ChromaDB is running)
    • HF_MODEL:- HuggingFace Model Name (Defaults to mistralai/Mistral-7B-v0.1, please include the complete name of the model)
    • HF_TOKEN:- Provide HuggingFace Access Token for accessing Gated models like Falcon180B and Llama-2
  3. Click on the AMP and "Configure Project", disable Spark as it is not required.

  4. Once the AMP steps are completed, We can access the Gradio UI via the Applications page.

Note: The application creates a "default" collection in the VectorDB when the AMP is launched the first time.

Steps to Use the Gradio App

  1. Navigate to the "Upload File" Tab and use the "Click to Upload Button" to upload a file

Uploading Files

  1. Once the files have been uploaded, use the "Embed Document" button to store the document into VectorDB

Note Embedding documents is lenghty process and can take some time to complete.

Embedding Files

  1. Once Embedding has completed, switch to the FileGPT tab and enter your questions via the textbox and Submit button below.

Asking Questions

Video Demo

Demo.mov