dvmorozov/arxiv

Graph displaying article topics


Task

Implement the main graph displaying article topics and the relations between them.

  1. The graph should display a set of topics (keywords) and their relations. ✔️
  2. The size of a node should correspond to the number of articles related to the topic. ✔️
  3. #82.
  4. Hovering the mouse over a node should display a pop-up showing the topic name and the number of articles. ✔️
  5. Edges of the graph should connect topics that share articles. ✔️ Only the most important links are displayed.
  6. #86.
  7. #85.

Solution

  1. Implement a Python script that extracts graph data from the arXiv metadata. ✔️ Related: #75.
  2. Filter the data to show only the most important relations. ✔️
  3. Implement the graph page and the JavaScript to load and visualize the data. ✔️
  4. Use the force-directed graph provided by d3.js. ✔️
  5. Use GitHub as the hosting platform. Use JavaScript as the data format. No back-end code. ✔️
  6. On the page, add references to the components used. ✔️
  7. Add a hyperlink that opens the graph in a separate browser tab. ✔️
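
Steps 1, 2 and 5 of the list above could be sketched roughly as follows. This is a minimal illustration, not the script from the repository. It assumes each metadata record carries a space-separated `categories` string (as in the Kaggle arXiv snapshot) and treats those categories as topics. The names `build_graph`, `to_javascript`, `max_links` and `graphData` are hypothetical.

```python
import json
from collections import Counter
from itertools import combinations


def build_graph(records, max_links=100):
    """Count articles per topic and keep the strongest topic-to-topic links."""
    node_counts = Counter()
    link_counts = Counter()
    for record in records:
        # Assumption: topics are the space-separated arXiv categories.
        topics = sorted(set(record.get("categories", "").split()))
        node_counts.update(topics)
        # Every pair of topics sharing an article adds weight to one edge.
        link_counts.update(combinations(topics, 2))

    # Keep only the most important relations (highest shared-article counts).
    top_links = link_counts.most_common(max_links)
    nodes = [{"id": t, "count": c} for t, c in sorted(node_counts.items())]
    links = [{"source": a, "target": b, "value": w} for (a, b), w in top_links]
    return {"nodes": nodes, "links": links}


def to_javascript(graph, variable="graphData"):
    """Serialize the graph as a JavaScript assignment for a plain <script> tag."""
    return f"var {variable} = {json.dumps(graph, indent=2)};\n"
```

Writing the result of `to_javascript(...)` to a `.js` file lets a static page load the data with a `<script>` tag, which fits the "no back-end code" constraint. The `count` field could then drive node size and `value` the edge weight in the d3.js layout.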

Dependencies

https://pypi.org/project/ijson/
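
ijson parses JSON incrementally, which helps because the metadata file is too large to load comfortably in one piece. The sketch below shows one way it might be used. It assumes the file holds one JSON object per line, and the helper name `iter_records` is hypothetical. If ijson is not installed, the sketch falls back to the standard `json` module.

```python
import json

try:
    import ijson  # third-party: https://pypi.org/project/ijson/
except ImportError:  # fallback so the sketch still runs without ijson
    ijson = None


def iter_records(path):
    """Yield metadata records one at a time without loading the whole file."""
    with open(path, "rb") as handle:
        if ijson is not None:
            # The empty prefix selects top-level values; multiple_values=True
            # accepts a file made of many concatenated JSON objects.
            yield from ijson.items(handle, "", multiple_values=True)
        else:
            # Assumption: one JSON object per non-empty line.
            for line in handle:
                if line.strip():
                    yield json.loads(line)
```

Because `iter_records` is a generator, records can be counted or filtered as they stream past, and memory use stays roughly constant regardless of file size.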

Data

https://www.kaggle.com/datasets/Cornell-University/arxiv
https://arxiv.org/help/bulk_data

Commands

Extract the downloaded data:

gzip -d arxiv-public-datasets.gz

Tools

https://github.com/mattbierbaum/arxiv-public-datasets