This project is motivated by a blog post from LlamaIndex (link).
This RAG application runs completely locally. It uses Ollama (to run the LLM locally), LlamaIndex (the RAG framework), Qdrant (the vector DB), and Flask (the web API).
Install Ollama to run the LLM on your local machine, then:
- Terminal-1: `ollama run mistral` (or `ollama run mixtral`)
- Terminal-2: `docker run -p 6333:6333 qdrant/qdrant`
- Terminal-3: `python3 -m venv venv && source venv/bin/activate && pip install llama-index qdrant_client torch transformers flask flask-cors`
Just to make sure the LLM is listening, run test.py:

    from llama_index.llms import Ollama

    llm = Ollama(model="mistral")
    response = llm.complete("Who is Laurie Voss?")
    print(response)
Then run the Flask application:

    python app.py
Open another terminal and check the app:
- Terminal-4: `curl --location "http://127.0.0.1:5000/process_form" --form 'query="What does the author -----?"'`
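For reference, here is a minimal sketch of what app.py could look like. The `/process_form` endpoint and the `query` form field come from the curl call above; the `rag_docs` collection name, the `embed_model="local"` choice, and re-opening the index from Qdrant (rather than rebuilding it) match the ingestion sketch earlier and are assumptions, not the author's exact code.

```python
# app.py -- hedged sketch of the Flask API serving the RAG query engine.
from flask import Flask, request, jsonify
from flask_cors import CORS
import qdrant_client
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.llms import Ollama
from llama_index.vector_stores.qdrant import QdrantVectorStore

app = Flask(__name__)
CORS(app)

# Re-open the index that the ingestion step stored in Qdrant.
client = qdrant_client.QdrantClient(host="localhost", port=6333)
vector_store = QdrantVectorStore(client=client, collection_name="rag_docs")
service_context = ServiceContext.from_defaults(llm=Ollama(model="mistral"), embed_model="local")
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)
query_engine = index.as_query_engine()

@app.route("/process_form", methods=["POST"])
def process_form():
    # The curl call sends the question as a form field named "query".
    query = request.form.get("query", "")
    response = query_engine.query(query)
    return jsonify({"response": str(response)})

if __name__ == "__main__":
    app.run(port=5000)  # matches the curl target http://127.0.0.1:5000
```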
Note:
- The dataset should be in JSON format.
- Mixtral needs a hefty 48 GB of RAM to run smoothly, so I used the Mistral 7B model. You can run Mixtral with the same procedure if your machine can manage 48 GB of RAM.