<img src="https://raw.githubusercontent.com/hpe-design/logos/master/Requirements/color-logo.png" alt="HPE Logo" height="100"/>
Author: Andrew Mendez
Date: 05/01/2024
Revision: 0.1
This demonstration was built to showcase Retrieval Augmented Generation (RAG) on HPE press release documents. It shows how RAG can help customers keep up with recent HPE news articles. The information used to answer questions is sourced from publicly available HPE press releases.
To replicate this demo, you will need:
- A functioning Kubernetes cluster with load balancers configured for external-facing services
- A shared folder mounted on the cluster at /nvmefs1/tyler.britten
- Pachyderm/HPE MLDM 2.9.2 installed on the cluster and fully functional
- At least one NVIDIA A100 80GB GPU
- Determined.AI/HPE MLDE environment for finetuning models (not included in the base code here)
NOTE: You might be able to replicate this demo with other GPUs (for example L40s) as well, but you need to consider the memory footprint of other GPUs and make adjustments accordingly.
[ToDo]
The chart below gives a logical overview of the application flow of this demo.
- Step 1: Connect to the deployed MLDM application
- Step 2: Create an MLDM project named rag-demo-hpe
- Step 3: Set the new project as the current context
- Step 4: Create the repo documents to hold XML documents
- Step 5: Upload XML documents
- Step 6: Pipeline step to parse documents
- Step 7: Pipeline step to chunk documents
- Step 8: Pipeline step to embed documents using the embedding model bge-large-en-v1.5
- Step 9: Pipeline step to deploy the GUI application
- Step 10: Interact with the GUI application
- Step 11: Add new documents to the repo documents to improve the RAG app
- Step 12: (Optional) Pipeline step to build a dataset for fine-tuning embeddings (qna pipeline)
- Step 13: (Optional) Pipeline step to fine-tune embeddings
- Step 14: Delete pipelines
RAG demo includes solution components from:
pachctl connect pachd-peer.pachyderm.svc.cluster.local:30653
pachctl create project rag-demo-hpe
pachctl config update context --project rag-demo-hpe
http://mldm-pachyderm.us.rdlabs.hpecorp.net/lineage/pdf-rag-andrew
pachctl create repo documents
pachctl put file documents@master: -f data/antonio-neri.xml
pachctl put file documents@master: -f data/aruba_wifi_7_press.xml
pachctl put file documents@master: -f data/e2e_ai_platform_press_release.xml
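When the data folder holds many files, the upload commands above can be generated rather than typed by hand. The following is a minimal sketch, not part of the demo code: the helper name build_put_commands is hypothetical, and it only prints the commands so they can be reviewed before running.

```python
from pathlib import Path
import shlex


def build_put_commands(data_dir, repo="documents", branch="master"):
    """Return one `pachctl put file` argument list per XML file in data_dir."""
    commands = []
    for xml_path in sorted(Path(data_dir).glob("*.xml")):
        # Same form as the manual commands: pachctl put file documents@master: -f <file>
        commands.append(
            ["pachctl", "put", "file", f"{repo}@{branch}:", "-f", str(xml_path)]
        )
    return commands


if __name__ == "__main__":
    # Print instead of executing, so nothing is uploaded by accident.
    for argv in build_put_commands("data"):
        print(shlex.join(argv))
```

To actually run the commands, each argument list could be passed to subprocess.run(argv, check=True).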
This pipeline takes the raw XML documents and parses them into JSON format.
pachctl create pipeline -f pipelines/parsing.pipeline.json
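To illustrate what a parsing step of this kind does, here is a minimal sketch using only the Python standard library. It is not the code behind parsing.pipeline.json: the XML layout of the press releases is not shown in this README, so the sample tags (release, title, body) and the output fields (source, root_tag, text) are assumptions.

```python
import json
import xml.etree.ElementTree as ET


def parse_xml_document(xml_text, source_name):
    """Flatten one XML document into a JSON-serializable dict."""
    root = ET.fromstring(xml_text)
    # Collect every non-empty text node in document order.
    pieces = [t.strip() for t in root.itertext() if t.strip()]
    return {"source": source_name, "root_tag": root.tag, "text": " ".join(pieces)}


if __name__ == "__main__":
    # Hypothetical sample; the real documents may use different tags.
    sample = "<release><title>Example title</title><body>Example body text.</body></release>"
    print(json.dumps(parse_xml_document(sample, "sample.xml"), indent=2))
```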
This pipeline takes the parsed documents and splits them into chunks.
pachctl create pipeline -f pipelines/chunking.pipeline.json
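Chunking can be done in many ways, and this README does not say which strategy chunking.pipeline.json uses. As one possible approach, the sketch below splits text into fixed-size word windows with some overlap, so a sentence cut at a boundary still appears whole in a neighboring chunk. The function name and the default sizes are assumptions.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    text = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(text, chunk_size=4, overlap=1):
        print(chunk)
```

Smaller chunks make retrieval more precise but give the model less context per hit, so the sizes are worth tuning against the actual documents.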
This pipeline takes the chunked documents and creates vector embeddings using the embedding model bge-large-en-v1.5.
pachctl create pipeline -f pipelines/embedding.pipeline.json
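The sketch below shows how chunk embeddings are typically used for retrieval: each chunk is embedded once, and a question is answered from the chunks whose vectors are closest to the question vector. To stay self-contained it uses a toy hashed bag-of-words encoder as a stand-in. That encoder is an assumption made for illustration only; in the demo the vectors come from bge-large-en-v1.5, and how the pipeline loads that model is not shown here.

```python
import hashlib
import math


def toy_encode(text, dim=64):
    """Stand-in embedder: hashed bag-of-words, L2-normalized. Not a real model."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(chunks, encode=toy_encode):
    """Embed every chunk once and keep the vector next to its text."""
    return [(chunk, encode(chunk)) for chunk in chunks]


def top_k(query, index, k=2, encode=toy_encode):
    """Rank chunks by cosine similarity (dot product of unit vectors)."""
    q = encode(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in index]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:k]]


if __name__ == "__main__":
    index = build_index(["alpha beta gamma", "delta epsilon zeta"])
    print(top_k("alpha beta gamma", index, k=1))
```

Passing a different encode callable, for example one that wraps the real embedding model, leaves build_index and top_k unchanged.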
This pipeline deploys a Streamlit application that provides the GUI for users to interact with.
pachctl create pipeline -f pipelines/gui.pipeline.json
Note: The Houston cluster does not have enough IP addresses for the service pipeline. The commands below reach the GUI through port forwarding instead.
ssh andrew@mlds-mgmt.us.rdlabs.hpecorp.net -L 8080:localhost:8080
kubectl port-forward -n pachyderm svc/pdf-rag-andrew-gui-v1-user 8080:80
Who is Antonio Neri?
Who is Neil MacDonald?
This step shows the key value proposition of a data-driven pipeline: when more documents are added, the RAG app is updated automatically.
pachctl put file documents@master: -f neil-macdonald.xml
When the pipeline is done, refresh the web page and ask:
Who is Neil MacDonald?
pachctl create pipeline -f pipelines/qna.pipeline.json
Note: the following values are hardcoded in fientune/experiment/const.yaml:
name: arctic-embed-fine-tune
workspace: Tyler
project: doc_embeds
The bind_mounts are also hardcoded. They assume you are running on a cluster (e.g. the Houston cluster) that has a shared folder mounted at /nvmefs1.
bind_mounts:
  - container_path: /nvmefs1/
    host_path: /nvmefs1/
    propagation: rprivate
    read_only: false
  - container_path: /determined_shared_fs
    host_path: /nvmefs1/determined/checkpoints
    propagation: rprivate
    read_only: false
pachctl create pipeline -f pipelines/finetune.pipeline.json
pachctl delete pipeline gui
pachctl delete pipeline finetune-embedding
pachctl delete pipeline generate-qna
pachctl delete pipeline embed-docs
pachctl delete pipeline chunk-doc
pachctl delete pipeline parse-docs
pachctl delete repo documents
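The delete commands above run from the most downstream pipeline back to the input repo, since a pipeline generally has to be removed before the pipelines and repos it reads from. The sketch below derives such an order from a dependency map. The map itself is an assumption: the pipeline names come from the commands above, but which pipeline reads from which is inferred from the step order, not taken from the pipeline specs.

```python
from graphlib import TopologicalSorter

# ASSUMPTION: pipeline -> inputs it reads from. Inferred from the step order
# in this README, not from the actual pipeline spec files.
ASSUMED_INPUTS = {
    "parse-docs": {"documents"},
    "chunk-doc": {"parse-docs"},
    "embed-docs": {"chunk-doc"},
    "gui": {"embed-docs"},
    "generate-qna": {"chunk-doc"},
    "finetune-embedding": {"generate-qna"},
}


def deletion_order(inputs):
    """Return names so that every consumer comes before the inputs it reads."""
    upstream_first = list(TopologicalSorter(inputs).static_order())
    return list(reversed(upstream_first))


def delete_commands(inputs, repos=("documents",)):
    """Build one pachctl delete command per pipeline or repo, consumers first."""
    commands = []
    for name in deletion_order(inputs):
        kind = "repo" if name in repos else "pipeline"
        commands.append(f"pachctl delete {kind} {name}")
    return commands


if __name__ == "__main__":
    for command in delete_commands(ASSUMED_INPUTS):
        print(command)
```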