RAG-Powered Web and PDF Content Extraction Application 📊📚

This Streamlit-based web application extracts information from web pages and PDF files. Users enter one or more URLs or upload PDF documents, then ask specific questions about the content and receive detailed answers. The application is built with LangChain and Google Generative AI: it uses RAG (Retrieval-Augmented Generation) to find the relevant passages in the supplied sources and Gemini models to generate the response, so users do not have to read long documents in full 🚀. It is intended for researchers, students, and anyone who needs quick answers from large documents 🔍💡.

Features

  • Multiple URL Inputs: Enter one or more URLs to extract information from web pages.
  • PDF Document Upload: Upload PDF files to retrieve specific information.
  • RAG and Gemini Integration: Retrieves the relevant content from the supplied sources and uses Gemini models to generate in-depth responses to queries.
  • Efficient Information Retrieval: Quickly access information from large datasets and documents.
  • User-Friendly Interface: Streamlit-based application for easy and efficient use.
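
The retrieval-augmented flow behind these features can be illustrated with a minimal sketch that uses only the Python standard library. It is not this application's code: the app relies on LangChain and Google Generative AI, whereas this example substitutes a toy word-count cosine similarity for real embeddings and stops at building the prompt instead of calling a Gemini model. All function names and parameters (split_into_chunks, retrieve, build_prompt, chunk_size, overlap, top_k) are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def word_counts(text):
    """Toy stand-in for an embedding: lowercase word frequencies."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    q = word_counts(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(q, word_counts(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to a generative model."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Hypothetical document text; in the app this would come from a URL or PDF.
    document = (
        "Streamlit renders the user interface. "
        "Uploaded PDF files are split into passages. "
        "Relevant passages are retrieved for each question."
    )
    question = "How are PDF files handled?"
    passages = split_into_chunks(document, chunk_size=8, overlap=2)
    print(build_prompt(question, retrieve(question, passages, top_k=1)))
```

In a real pipeline, the word-count vectors would be replaced by embeddings from an embedding model, and the assembled prompt would be passed to the generative model rather than printed.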

Getting Started

To use this application, follow these steps:

  1. Clone the repository to your local machine.
  2. Install the required dependencies listed in requirements.txt, for example with pip install -r requirements.txt.
  3. Run the Streamlit application by executing streamlit run your_app.py in your terminal, replacing your_app.py with the name of the application's entry-point script.
  4. Navigate to the provided local URL in your web browser to start using the application.

License

This project is licensed under the MIT License - see the LICENSE file for details.