/Similar-Paper-Reccomendation

This repository contains an application designed to recommend scientific papers that are most similar to a given input paragraph. The application uses the llama and weaviate libraries to achieve this.

Primary LanguageJupyter Notebook

Paper Similarity Search with Streamlit and Weaviate

This repository contains an application designed to recommend scientific papers that are most similar to a given input paragraph. The application uses the llama and weaviate libraries to achieve this. For ease of deployment, a docker-compose.yml file is provided to run Weaviate in a container since native installation on Windows posed challenges.

Table of Contents

Methodology

  1. Data Indexing: The application begins by reading scientific papers from a designated bucket and indexing them using Weaviate. The data is read using the SimpleDirectoryReader and parsed into nodes with the SimpleNodeParser.
  2. Vector Database Creation: Each node (paper or extracted text) is transformed into a vector using Weaviate's capabilities.
  3. Querying: On inputting a paper's paragraph, the application queries the vector database to get the top 3 most similar papers.
  4. Output Presentation: The titles and summaries of the top 3 papers are presented to the user.

Setup and Installation

Prerequisites

  • Docker
  • Python 3.x

Steps

  1. Clone the Repository:

    git clone https://github.com/fshnkarimi/Similar-Paper-Reccomendation.git
    cd Similar-Paper-Reccomendation
  2. Create a Virtual Environment:

    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  3. Install Dependencies:

    pip install -r requirements.txt
  4. Run Weaviate with Docker: If you're on Windows or facing issues with Weaviate's native installation, the provided docker-compose.yml makes it easy to run Weaviate in a Docker container.

    docker-compose up -d

Running the Application

  1. Start the Streamlit App:

    streamlit run app.py
  2. Visit the URL shown in the terminal to interact with the application.

  3. Input a paragraph from a scientific paper and get recommendations!

Notebook Approach

If you'd rather see the step-by-step breakdown of the entire application along with the corresponding outputs, you can use the Jupyter Notebook:

  1. Navigate to the notebooks directory:

    cd notebooks
  2. Start Jupyter:

    jupyter notebook
  3. Open the provided notebook and execute the cells in sequence.

Demo GIF