Inquiry is a search engine of electronic preprints.
For now, you can access to papers in the fields of mathematics, physics, computer science and statistics. These papers are retrieved from the arXiv repository.
Prerequisites
In order to use Inquiry, we assume you have met the following requirements:
- python>=3.x
- elasticsearch==7.4.0
- gcc
- Poppler cpp lib
Installing Inquiry
To install Inquiry, follow these steps:
-
Clone this repository:
$ git clone https://github.com/javiergarea/inquiry.git
-
Run the following command to install the project dependencies:
$ pip3 install -r requirements.txt
If something goes wrong during this step, ensure you have installed
pip
,gcc
andpopplerlib
.
Running Inquiry
-
Run the arXiv spider in order to crawl the documents:
$ scrapy crawl arxiv
This should generate an
items.jsonl
file in the root directory. -
Start the Elasticsearch service:
$ elasticsearch
Check that is running properly by running the command
curl localhost:9200
. -
Index the crawled data in Elasticsearch:
$ python3 elastic_manage.py -i items.jsonl
-
Run the Inquiry service:
$ python3 manage.py runserver
-
Access to localhost:8000 and perform your queries.
Documentation
Inquiry is an Information Retrieval project. This project has been developed as part of the MSc. in Computer Science at Universidade da Coruña. The software is accompanied by a technical document which details its development. This document is available in web version.
Authors
Javier Garea - javier.garea@udc.es
Martín Sande - martin.sande@udc.es
License
This project uses the following license: MIT.