- Clone the repository.
- Navigate to your repository directory: `cd your-repository`.
- Create and activate a virtual environment: `pipenv shell`.
- Install the required packages: `pipenv install`.
- Set up environment variables: create a `.env` file in the root directory of your project and add your Pinecone and OpenAI API keys as `PINECONE_API_KEY=` and `OPENAI_API_KEY=`.
- Fetch data from the MongoDB documentation site: `mkdir mongodb-docs`, then `wget -r -P mongodb-docs -E https://www.mongodb.com/docs/manual`.
- Pre-process the data by running the `process_data.py` script. If successful, you should see `Going to add xxx to Pinecone` followed by `Loading to vectorstore done`.
- Start the app: `streamlit run main.py`.
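The environment-variable step above expects a `.env` file. A minimal example (the values here are placeholders, not real keys) might look like:

```
PINECONE_API_KEY=your-pinecone-api-key
OPENAI_API_KEY=your-openai-api-key
```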
The command `wget -r -P mongodb-docs -E https://www.mongodb.com/docs/manual` mirrors MongoDB's documentation site locally. The `process_data.py` script then processes the downloaded pages and stores them in a Pinecone Vector Store for efficient retrieval, embedding them with OpenAI's embedding model. The script:
- Loads documents from MongoDB documentation.
- Splits documents into smaller chunks for efficient processing.
- Updates document metadata with the correct source URLs.
- Adds processed documents to a Pinecone Vector Store.
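The steps above could be sketched roughly as follows. This is not the project's actual implementation: the loader choice, chunk sizes, index name, and path-to-URL mapping are assumptions.

```python
# Sketch of an ingestion script in the spirit of process_data.py.
# ReadTheDocsLoader, the chunk sizes, and "mongodb-docs-index" are assumptions.

def to_source_url(local_path: str) -> str:
    """Map a wget-mirrored file path back to its original URL, e.g.
    'mongodb-docs/www.mongodb.com/docs/manual/index.html'
      -> 'https://www.mongodb.com/docs/manual/index.html'."""
    relative = local_path.split("mongodb-docs/")[-1]
    return "https://" + relative


def ingest(docs_dir: str = "mongodb-docs") -> None:
    # Imports kept local so the helper above can be used without langchain.
    from langchain_community.document_loaders import ReadTheDocsLoader
    from langchain_openai import OpenAIEmbeddings
    from langchain_pinecone import PineconeVectorStore
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    raw_docs = ReadTheDocsLoader(docs_dir).load()            # load mirrored HTML
    splitter = RecursiveCharacterTextSplitter(chunk_size=600, chunk_overlap=60)
    chunks = splitter.split_documents(raw_docs)              # split into chunks
    for chunk in chunks:                                     # fix source metadata
        chunk.metadata["source"] = to_source_url(chunk.metadata["source"])
    print(f"Going to add {len(chunks)} to Pinecone")
    PineconeVectorStore.from_documents(                      # embed and upsert
        chunks, OpenAIEmbeddings(), index_name="mongodb-docs-index"
    )
    print("Loading to vectorstore done")
```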
- Python 3.x
- python-dotenv
- langchain
- langchain-community
- langchain-openai
- langchain-pinecone
- Pinecone account and API key
- OpenAI API key
This Python script implements a Retrieval-Augmented Generation (RAG) model using LangChain, OpenAI, and Pinecone. The script retrieves relevant documents based on a query, incorporates chat history, and generates responses using OpenAI's language models.
- Embeds documents using OpenAI's embedding model.
- Retrieves documents from Pinecone Vector Store.
- Rephrases queries and performs retrieval-based question answering.
- Combines retrieved documents to generate a response.
- Python 3.x
- python-dotenv
- langchain
- langchain-openai
- langchain-pinecone
- Pinecone account and API key
- OpenAI API key
This function:
- Initializes OpenAI embeddings and Pinecone Vector Store.
- Sets up a chat model with OpenAI's language model.
- Pulls prompts for rephrasing queries and retrieval-based question answering.
- Creates a history-aware retriever and a retrieval chain.
- Invokes the retrieval chain with the input query and chat history.
- Returns the generated result.
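Put together, a function matching the steps above might look like the following sketch. The hub prompt names, index name, and model settings are assumptions based on common LangChain patterns, not taken from the project's source.

```python
# Sketch of a history-aware RAG query function; prompt names, the index
# name, and the model configuration are assumptions.

def run_llm(query: str, chat_history=None):
    from langchain import hub
    from langchain.chains import create_history_aware_retriever, create_retrieval_chain
    from langchain.chains.combine_documents import create_stuff_documents_chain
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    from langchain_pinecone import PineconeVectorStore

    # 1. Initialize OpenAI embeddings and the Pinecone Vector Store
    #    (reads PINECONE_API_KEY / OPENAI_API_KEY from the environment).
    embeddings = OpenAIEmbeddings()
    docsearch = PineconeVectorStore(index_name="mongodb-docs-index",
                                    embedding=embeddings)

    # 2. Set up the chat model.
    chat = ChatOpenAI(temperature=0)

    # 3. Pull prompts for rephrasing and retrieval-based QA.
    rephrase_prompt = hub.pull("langchain-ai/chat-langchain-rephrase")
    qa_prompt = hub.pull("langchain-ai/retrieval-qa-chat")

    # 4. Create a history-aware retriever and a retrieval chain.
    retriever = create_history_aware_retriever(
        chat, docsearch.as_retriever(), rephrase_prompt
    )
    combine_docs = create_stuff_documents_chain(chat, qa_prompt)
    qa_chain = create_retrieval_chain(retriever, combine_docs)

    # 5. Invoke with the query and prior turns; return the generated result.
    return qa_chain.invoke({"input": query, "chat_history": chat_history or []})
```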