Literature Review Bot
Setup
Docker and Qdrant
- Download and set up Docker and/or Docker Desktop
- Pull the Qdrant container and run it:
docker run -p 6333:6333 -v ./qdrant_storage:/qdrant/storage qdrant/qdrant
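Before moving on, it can help to confirm that the container is actually answering on port 6333. The following is a minimal sketch using only the Python standard library; it assumes Qdrant's REST API is exposed at http://localhost:6333 (as in the docker command above) with no API key configured, and the helper name qdrant_is_up is hypothetical, not part of this repo.

```python
import urllib.error
import urllib.request


def qdrant_is_up(base_url: str = "http://localhost:6333", timeout: float = 2.0) -> bool:
    """Return True if the Qdrant REST API answers at base_url.

    Assumption: the /collections endpoint is reachable without an API key.
    Any connection error, timeout, or non-200 response is treated as "not up".
    """
    try:
        with urllib.request.urlopen(f"{base_url}/collections", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # URLError covers refused connections and HTTP errors;
        # OSError covers low-level socket timeouts.
        return False
```

Calling qdrant_is_up() from a Python shell should return True once the container has finished starting; if it returns False, check that Docker is running and that port 6333 is not already in use.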
Download arXiv metadata from Kaggle
Or download my processed data directly (but it is out of date)
- https://www.kaggle.com/datasets/ltcmdrdata/arxiv-embeddings (I will not be maintaining this)
Process data
- Run generate_embeddings.py to fill up the embeddings folder (you may need to create this folder first)
- Fire up Qdrant if it's not already running
- Run index_arxiv_metadata.py to upload embeddings to Qdrant
- Run search_server.py and go to http://127.0.0.1 to search for your articles
- Download the PDFs you want from arXiv into the PDFs folder
- Run generate_literature_review.py to create your final literature review
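
Since the steps above expect an embeddings folder and a PDFs folder to exist, here is a small sketch for creating them up front. It assumes both folders live in the directory you run the scripts from, which this README does not state explicitly; the helper name ensure_folders is hypothetical.

```python
from pathlib import Path


def ensure_folders(root: Path = Path("."), names=("embeddings", "PDFs")) -> list[Path]:
    """Create the working folders under root if they are missing.

    Assumption: the scripts look for these folders relative to the
    current working directory. Existing folders are left untouched.
    """
    folders = []
    for name in names:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders
```

Running ensure_folders() once from the repository root before generate_embeddings.py should avoid a missing-folder error; it is safe to call again later because existing folders and their contents are not modified.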