Retrieval Augmented Generation Chatbot using Langchain 🦜🔗 and HuggingFace 🤗
Retrieval Augmented Generation (RAG) pairs a pre-trained Large Language Model (LLM) with your own data to produce grounded responses. The approach combines a dense retriever with a sequence-to-sequence generator: given a query, the system first retrieves the most relevant documents, passes them to the generator as context, and then produces an answer that draws on both the model's pre-trained knowledge and the retrieved content.
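The retrieve-then-generate loop can be sketched without any framework. The function names and the toy word-overlap scorer below are illustrative stand-ins for what LangChain's retriever and LLM call do in this app, not the project's actual code:

```python
import re

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query
    (a toy stand-in for dense retrieval)."""
    q_words = set(re.findall(r"\w+", query.lower()))
    overlap = lambda d: len(q_words & set(re.findall(r"\w+", d.lower())))
    return sorted(documents, key=overlap, reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Feed the retrieved documents to the generator as context;
    the seq2seq model would then complete this prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

docs = [
    "RAG combines retrieval with generation.",
    "Gradio provides a simple web UI.",
    "LangChain chains LLM calls together.",
]
prompt = build_prompt("What is RAG?", retrieve("What is RAG?", docs, k=1))
```

In the real application, `retrieve` is replaced by a vector-store similarity search and `build_prompt`'s output is sent to a HuggingFace-hosted LLM.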
To get started, create a virtual environment and activate it:
```sh
virtualenv venv
source venv/bin/activate
```
Create a local environment file (`.env`) and add your Hugging Face API key:

```
HF_TOKEN=your_huggingface_api_key
```
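In practice the app would typically load this file with a helper such as python-dotenv's `load_dotenv()`. As a rough sketch of what that step does, here is a minimal stdlib-only parser; `parse_env` is an illustrative helper, not part of this project:

```python
import os

def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Normally you would read the .env file from disk; inlined here for clarity.
config = parse_env("HF_TOKEN=your_huggingface_api_key")

# Export the token so HuggingFace client libraries that look for
# HF_TOKEN in the environment can find it.
os.environ.setdefault("HF_TOKEN", config["HF_TOKEN"])
```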
Next, install the required dependencies using pip:
```sh
pip install -r requirements.txt
```
Now, you can run the application:
```sh
gradio app.py
```
This starts the application. Once it is up and running, open the Gradio web interface in your browser to chat with the RAG model.