Relational-GCN (RGCN)

Applies Relational-GCN (RGCN) to heterogeneous datasets such as Amazon, IMDb, DBLP and ACM.
Paper: https://arxiv.org/abs/1703.06103
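
For orientation, the propagation rule from the paper can be sketched in plain Python. This is a minimal illustration, not the code in entity_classify.py: it assumes mean normalization (the constant c_{i,r} is the number of neighbors of node i under relation r), a ReLU activation, and no basis decomposition or bias. The function and argument names are hypothetical.

```python
def rgcn_layer(features, edges_by_relation, rel_weights, self_weight):
    """One R-GCN propagation step (illustrative sketch).

    features:          list of per-node feature vectors, shape [num_nodes][in_dim]
    edges_by_relation: dict mapping relation name -> list of (src, dst) pairs
    rel_weights:       dict mapping relation name -> matrix [in_dim][out_dim]
    self_weight:       self-loop matrix W_0, shape [in_dim][out_dim]
    """
    num_nodes = len(features)
    out_dim = len(self_weight[0])

    def matvec(vec, weight):
        # Multiply a row vector [in_dim] by a matrix [in_dim][out_dim].
        return [sum(vec[k] * weight[k][j] for k in range(len(vec)))
                for j in range(out_dim)]

    # Self-connection term: W_0 h_i for every node.
    out = [matvec(features[i], self_weight) for i in range(num_nodes)]

    for rel, edges in edges_by_relation.items():
        # Collect incoming neighbors of each node under this relation.
        neighbors = {i: [] for i in range(num_nodes)}
        for src, dst in edges:
            neighbors[dst].append(src)
        for i, nbrs in neighbors.items():
            if not nbrs:
                continue
            c = len(nbrs)  # normalization constant c_{i,r} (assumed: neighbor count)
            for j in nbrs:
                msg = matvec(features[j], rel_weights[rel])
                for d in range(out_dim):
                    out[i][d] += msg[d] / c

    # ReLU activation.
    return [[max(0.0, v) for v in row] for row in out]
```

Each relation has its own weight matrix, so messages arriving over different edge types (for example Paper-Author and Paper-Subject) are transformed differently before being summed.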

Node Classification Task

python3 entity_classify.py -d [DATASET] -e [EPOCHS] --testing --gpu [CPU/ GPU]

where DATASET = {amazon, imdb, dblp, acm}, and --gpu = 0 (CPU) or -1 (GPU)

Example for running RGCN on Amazon data

python3 entity_classify.py -d amazon --testing --gpu 0
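
To run the same command for every dataset, a small helper can assemble the argument list. This is a sketch that uses only the flags documented above; the helper name `build_command`, the dataset tuple, and the epoch count of 50 are assumptions made for illustration.

```python
import shlex

# Dataset names as accepted by -d (assumed from the list in this README).
DATASETS = ("amazon", "imdb", "dblp", "acm")


def build_command(dataset, epochs=None, testing=True, gpu=0):
    """Return the argument list for one entity_classify.py run."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    cmd = ["python3", "entity_classify.py", "-d", dataset]
    if epochs is not None:
        cmd += ["-e", str(epochs)]
    if testing:
        cmd.append("--testing")
    cmd += ["--gpu", str(gpu)]
    return cmd


if __name__ == "__main__":
    # Print one command per dataset; 50 epochs is an arbitrary example value.
    for name in DATASETS:
        print(shlex.join(build_command(name, epochs=50)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the run instead of printing it.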

Dataset Statistics

ACM

Author	Paper	Subject	Paper-Author	Paper-Subject	Features	Train	Val	Test
5,912	3,025	57	9,936	3,025	1,902	600	300	2,125

IMDb

Movie	Actor	Director	Movie-Actor	Movie-Director	Train	Val	Test
4,780	5,841	2,269	14,340	4,780	300	300	2,687

DBLP

Author	Paper	Conf	Term	Paper-Author	Paper-Conf	Paper-Term	Train	Val	Test
4,057	14,328	20	8,789	19,645	14,328	88,420	800	400	2,857

Fraud Amazon Dataset
The Amazon dataset includes product reviews under the Musical Instruments category. Users with more than 80% helpful votes are labelled as benign entities, and users with less than 20% helpful votes are labelled as fraudulent entities. Fraudulent user detection on this dataset is a binary classification task. Twenty-five handcrafted features are used as the raw node features.
Users are nodes in the graph, and there are three relations:
1. U-P-U: connects users who reviewed at least one product in common.
2. U-S-U: connects users who gave at least one identical star rating within one week.
3. U-V-U: connects users with top 5% mutual review text similarities (measured by TF-IDF) among all users.
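
As an illustration of how a relation such as U-P-U could be derived, the sketch below links every pair of users who reviewed the same product. The input format (a list of `(user_id, product_id)` pairs) and the function name are assumptions; the dataset itself may store these relations in a different, precomputed form.

```python
from collections import defaultdict
from itertools import combinations


def build_upu_edges(reviews):
    """Return undirected user-user edges for users sharing a reviewed product.

    reviews: iterable of (user_id, product_id) pairs (hypothetical format).
    Each edge is a tuple (u, v) with u < v, so every pair appears once.
    """
    users_by_product = defaultdict(set)
    for user, product in reviews:
        users_by_product[product].add(user)

    edges = set()
    for users in users_by_product.values():
        # Every pair of users who reviewed this product gets an edge.
        for u, v in combinations(sorted(users), 2):
            edges.add((u, v))
    return edges
```

The same grouping pattern would apply to U-S-U by keying on (star rating, week) instead of product. U-V-U needs a pairwise text similarity score and a top 5% cutoff, which this sketch does not cover.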

Nodes	U-P-U	U-S-U	U-V-U	Positive (fraudulent)	Negative (benign)	Unlabeled
11,944	351,216	7,132,958	2,073,474	821	7,818	3,305
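
The labeling rule described above can be sketched as a small function. The 80% and 20% thresholds come from the description; the integer encoding (1 for fraudulent, 0 for benign), the function name, and the treatment of users with no votes as unlabeled are assumptions.

```python
BENIGN = 0      # "negative" class (assumed encoding)
FRAUDULENT = 1  # "positive" class (assumed encoding)


def label_user(helpful_votes, total_votes):
    """Return BENIGN, FRAUDULENT, or None (unlabeled) for one user."""
    if total_votes <= 0:
        # No votes to judge by: treated as unlabeled (an assumption).
        return None
    # Integer arithmetic avoids floating-point issues at the thresholds.
    if 100 * helpful_votes > 80 * total_votes:
        return BENIGN
    if 100 * helpful_votes < 20 * total_votes:
        return FRAUDULENT
    # Between 20% and 80% helpful votes: left unlabeled.
    return None
```

The three label counts in the table account for every node: 821 + 7,818 + 3,305 = 11,944.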
