graph-dive: A Jupyter Notebook repository from hw-ch0

📕 Predict a publication trend of AI journals / conferences using GNNs
Baseline paper: Structured Citation Trend Prediction Using Graph Neural Network

Members

👑차지수
윤수진
조현우
진현빈
박수빈
김산
김민서

Requirements

Versions (Recommended)

Python 3.7.x
PyTorch 1.12.1+cu113
PyTorch Geometric (torch_geometric) 2.1.0

Docker

We recommend using our Dockerfile to get started easily.

## build docker image
$ docker build -t graph-dive:latest . 

## execute docker container
$ docker run --name graph-dive --ipc=host -it -v <working_dir>:/workspace -w /workspace graph-dive:latest /bin/bash

Model

We follow the architecture of the baseline paper, which is based on GATs and GCNs.
[Training stage]

[Prediction stage]

Dataset

MAG (Microsoft Academic Graph)

We use each paper's authors, affiliations, citation count, title, abstract, and year as raw inputs. Please check this webpage for more information.

Data directory tree

The directory tree, including the data, should look as follows:

├─graph-dive/
└─data/
	├─ affiliationembedding.csv
	├─ edge_data/
	│   ├─ 1158167855_refs.csv #{CVPR_conference_id}_refs.csv
	│   ├─ 1184914352_refs.csv #{AAAI_conference_id}_refs.csv
	│   └─ ...
	├─ year_data/
	│   ├─ 1158167855.csv #{CVPR_conference_id}.csv
	│   ├─ 1184914352.csv #{AAAI_conference_id}.csv 
	│   └─ ...
	├─ json_1158167855/ # CVPR
	│   ├─ {paper_id1}.json
	│   ├─ {paper_id2}.json
	│   └─ ...
	├─ json_1184914352/ # AAAI
	│   └─ ...
	...
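
The layout above can also be expressed as paths in code. This is a minimal sketch, assuming the `data/` folder is arranged exactly as in the tree; the helper name `conference_paths` and its dictionary keys are hypothetical and not part of the repository.

```python
from pathlib import Path


def conference_paths(data_root, conference_id):
    """Return the expected files and folders for one conference.

    Hypothetical helper: the file names follow the directory tree above,
    but this function does not exist in the repository.
    """
    root = Path(data_root)
    cid = str(conference_id)
    return {
        "affiliations": root / "affiliationembedding.csv",
        "edges": root / "edge_data" / f"{cid}_refs.csv",
        "years": root / "year_data" / f"{cid}.csv",
        "papers": root / f"json_{cid}",
    }


if __name__ == "__main__":
    # 1158167855 is the CVPR conference ID used in the tree above.
    for name, path in conference_paths("data", 1158167855).items():
        print(f"{name}: {path} (exists: {path.exists()})")
```

Running the script from the directory that contains `data/` gives a quick check of which inputs are missing before a run.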

The conference ID and number of nodes for each journal/conference are as follows:

Conference	Conference ID	# of nodes
ICML	1180662882	8653
ICASSP	1121227772	16997
NeurIPS	1127325140	8113
AAAI	1184914352	13766
EMNLP	1192655580	5667
CVPR	1158167855	13058
ICDM	1183478919	4169
CIKM	1194094125	4201
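
For scripting, the table can be kept as a small lookup. This is a minimal sketch: the values are copied from the table above, while the names `CONFERENCES` and `conference_id` are hypothetical and not taken from the repository.

```python
# Conference name -> (conference ID, number of nodes), copied from the table above.
CONFERENCES = {
    "ICML": (1180662882, 8653),
    "ICASSP": (1121227772, 16997),
    "NeurIPS": (1127325140, 8113),
    "AAAI": (1184914352, 13766),
    "EMNLP": (1192655580, 5667),
    "CVPR": (1158167855, 13058),
    "ICDM": (1183478919, 4169),
    "CIKM": (1194094125, 4201),
}


def conference_id(name):
    """Look up a conference ID by name, ignoring case (hypothetical helper)."""
    for key, (cid, _nodes) in CONFERENCES.items():
        if key.lower() == name.lower():
            return cid
    raise KeyError(f"unknown conference: {name}")


if __name__ == "__main__":
    print(conference_id("cvpr"))
```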

Run

Command examples

# CVPR
$ bash scripts/run_CVPR.sh

# ICASSP
$ bash scripts/run_ICASSP.sh

Note that the number of valid data points is smaller than the node counts listed in the table above, because some records are unavailable from the sources (OpenAlex API, MAG dataset, etc.).
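
To see how large that gap is for a given conference, one option is to count the paper files on disk. This is a minimal sketch, assuming each valid paper is stored as one `{paper_id}.json` file under `json_{conference_id}/`; that one-file-per-paper reading of the data layout is an assumption, and the function names `count_valid_papers` and `coverage` are hypothetical.

```python
from pathlib import Path


def count_valid_papers(data_root, conference_id):
    """Count the .json files under json_{conference_id}/.

    Assumption: one JSON file corresponds to one valid paper.
    Returns 0 if the folder does not exist.
    """
    folder = Path(data_root) / f"json_{conference_id}"
    if not folder.is_dir():
        return 0
    return sum(1 for _ in folder.glob("*.json"))


def coverage(data_root, conference_id, nominal_nodes):
    """Return the fraction of the nominal node count that is present on disk."""
    if nominal_nodes <= 0:
        raise ValueError("nominal_nodes must be positive")
    return count_valid_papers(data_root, conference_id) / nominal_nodes


if __name__ == "__main__":
    # CVPR: conference ID 1158167855 with 13058 nodes listed in the table.
    print(coverage("data", 1158167855, 13058))
```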
