This repository contains an application designed to recommend scientific papers that are most similar to a given input paragraph. The application uses the `llama` and `weaviate` libraries to achieve this. For ease of deployment, a `docker-compose.yml` file is provided to run Weaviate in a container, since native installation on Windows posed challenges.
- Data Indexing: The application begins by reading scientific papers from a designated bucket and indexing them using Weaviate. The data is read using the `SimpleDirectoryReader` and parsed into nodes with the `SimpleNodeParser`.
- Vector Database Creation: Each node (paper or extracted text) is transformed into a vector using Weaviate's capabilities.
- Querying: Given a paragraph from a paper, the application queries the vector database for the top 3 most similar papers.
- Output Presentation: The titles and summaries of the top 3 papers are presented to the user.
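The four stages above can be illustrated with a small, dependency-free sketch. This is not the repository's code: it swaps Weaviate's vectorization for a toy bag-of-words vector and a plain cosine similarity, and the function names (`split_into_nodes`, `embed`, `build_index`, `top_k_papers`) and sample papers are hypothetical. It only aims to show the shape of the flow: split papers into nodes, turn each node into a vector, then rank papers against a query paragraph and keep the top 3.

```python
import math
import re
from collections import Counter


def split_into_nodes(text, max_words=50):
    """Split a document into chunks ("nodes") of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: term counts. A stand-in for a real vectorizer."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(papers):
    """Map {title: text} to a list of (title, node_text, vector) entries."""
    index = []
    for title, text in papers.items():
        for node in split_into_nodes(text):
            index.append((title, node, embed(node)))
    return index


def top_k_papers(index, paragraph, k=3):
    """Score each paper by its best-matching node and return the top k."""
    query = embed(paragraph)
    best = {}
    for title, _node, vector in index:
        score = cosine(query, vector)
        best[title] = max(score, best.get(title, 0.0))
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical sample data, used only to exercise the sketch.
papers = {
    "Graph Neural Networks": "graph neural networks learn node embeddings by message passing over edges",
    "Protein Folding": "protein structure prediction with deep learning and attention",
    "Vector Search": "approximate nearest neighbor search over dense vector embeddings",
    "Crop Yields": "rainfall and soil quality affect wheat crop yields",
}
index = build_index(papers)
results = top_k_papers(index, "nearest neighbor search over vector embeddings")
for title, score in results:
    print(f"{score:.3f}  {title}")
```

Scoring a paper by its best-matching node is one reasonable design choice when papers are split into several nodes; averaging node scores would be another.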
- Docker
- Python 3.x
- Clone the Repository:
    git clone https://github.com/fshnkarimi/Similar-Paper-Reccomendation.git
    cd Similar-Paper-Reccomendation
- Create a Virtual Environment:
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install Dependencies:
    pip install -r requirements.txt
- Run Weaviate with Docker: If you're on Windows or facing issues with Weaviate's native installation, the provided `docker-compose.yml` makes it easy to run Weaviate in a Docker container.
    docker-compose up -d
- Start the Streamlit App:
    streamlit run app.py
- Visit the URL shown in the terminal to interact with the application.
- Input a paragraph from a scientific paper and get recommendations!
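Before starting the Streamlit app, it can help to confirm that the Weaviate container is accepting requests. The following is a minimal sketch, assuming Weaviate is reachable at `http://localhost:8080` (Weaviate's usual default; the port mapping in this repository's `docker-compose.yml` may differ) and relying on Weaviate's readiness endpoint `/v1/.well-known/ready`. The helper name `weaviate_is_ready` is hypothetical and not part of the repository.

```python
import urllib.error
import urllib.request


def weaviate_is_ready(base_url="http://localhost:8080", timeout=2.0):
    """Return True if the readiness endpoint answers with HTTP 200."""
    url = base_url.rstrip("/") + "/v1/.well-known/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response: treat as not ready.
        return False


if __name__ == "__main__":
    # Assumed address; adjust to match the port exposed by docker-compose.yml.
    print("Weaviate ready:", weaviate_is_ready())
```

If this prints `False` shortly after `docker-compose up -d`, waiting a few seconds and retrying is usually enough, since the container may still be starting.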
If you'd rather see the step-by-step breakdown of the entire application along with the corresponding outputs, you can use the Jupyter Notebook:
- Navigate to the `notebooks` directory:
    cd notebooks
- Start Jupyter:
    jupyter notebook
- Open the provided notebook and execute the cells in sequence.