/graph-cnn.pytorch

PyTorch Implementation for Graph Convolutional Neural Networks


PyTorch implementation of Graph Convolutional Networks & Graph Attention Convolutional Networks.

This project was created by Bumsoo Kim, Ph.D. candidate at Korea University. This repo was forked from https://github.com/tkipf/pygcn.

Graph Convolutional Networks

Many important real-world datasets come in the form of graphs or networks: social networks, knowledge graphs, protein-interaction networks, the World Wide Web, etc. In this repository, we introduce a basic tutorial for generalizing neural networks to work on arbitrarily structured graphs, along with Graph Attention Convolutional Networks (Attention GCN).

Currently, most graph neural network models share a somewhat universal architecture. They are referred to as Graph Convolutional Networks (GCNs), since filter parameters are typically shared over all locations in the graph.

For these models, the goal is to learn a function of signals/features on a graph G=(V, E), which takes as

Input

  • N x D feature matrix (N : Number of nodes, D : number of input features)
  • a representative description of the graph structure in matrix form, typically an N x N adjacency matrix A

Output

  • N x F feature matrix (N : Number of nodes, F : number of output features)

Graph-level outputs can be modeled by introducing some form of pooling operation.
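As a rough illustration of such a pooling step, the sketch below averages the N x F node-level output over all nodes to get a single F-dimensional graph-level vector. Mean pooling is only one possible choice, and the name `mean_pool` is hypothetical (it is not taken from this repo).

```python
def mean_pool(Z):
    """Average an N x F node feature matrix (list of rows) over the N nodes."""
    n = len(Z)
    return [sum(column) / n for column in zip(*Z)]


# Toy node-level output with N = 3 nodes and F = 2 features.
Z = [[1.0, 0.0],
     [3.0, 2.0],
     [2.0, 4.0]]

z = mean_pool(Z)  # one vector of length F describing the whole graph
```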

Every neural network layer can then be written as a non-linear function

    H(l+1) = f(H(l), A)

with H(0) = X (the N x D input feature matrix) and H(L) = Z (the N x F output feature matrix), where L is the number of layers. The specific models then differ only in how the function f is chosen and parameterized.

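In this repo, the layer-wise propagation rule is

    f(H(l), A) = σ(A H(l) W(l))

where W(l) is the weight matrix of layer l and σ is a non-linear activation function. With ReLU (Rectified Linear Unit) as the activation function, this becomes

    f(H(l), A) = ReLU(A H(l) W(l))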
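A minimal sketch of one propagation step of the form ReLU(A H W), written with plain Python lists so it runs without PyTorch. The helper names (`matmul`, `relu`, `gcn_layer`) and the toy graph are assumptions for illustration, not the repo's actual layer code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def relu(m):
    """Apply ReLU element-wise."""
    return [[max(0.0, v) for v in row] for row in m]


def gcn_layer(A, H, W):
    """One propagation step: ReLU(A H W)."""
    return relu(matmul(matmul(A, H), W))


# Toy path graph 0 - 1 - 2 (no self-loops): N = 3 nodes, D = 2, F = 1.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H = [[1.0, 0.0],
     [0.0, 2.0],
     [3.0, -1.0]]
W = [[1.0],
     [-1.0]]

H_next = gcn_layer(A, H, W)  # N x F matrix of updated node features
```

Here node 1 aggregates the features of nodes 0 and 2, while nodes 0 and 2 only see node 1, whose projected value is negative and is clipped to zero by ReLU.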

Implementation detail #1 :

Multiplication with A means that, for every node, we sum up all the feature vectors of all neighboring nodes but not the node itself. To address this, we add the identity matrix to A.
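The effect of adding the identity matrix can be seen on a toy graph. The sketch below (hypothetical helper names, plain Python lists) compares aggregation with A and with A + I.

```python
def add_self_loops(A):
    """Return A + I for a square adjacency matrix given as a list of rows."""
    n = len(A)
    return [[A[i][j] + (1 if i == j else 0) for j in range(n)]
            for i in range(n)]


def aggregate(A, H):
    """Compute A H: for each node, sum the feature vectors selected by A."""
    return [[sum(a * h for a, h in zip(row, col)) for col in zip(*H)]
            for row in A]


# Path graph 0 - 1 - 2 with one scalar feature per node.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H = [[1.0], [10.0], [100.0]]

without_self = aggregate(A, H)                # own feature is dropped
with_self = aggregate(add_self_loops(A), H)   # own feature is included
```

Without self-loops, node 0 ends up with only its neighbor's value (10.0); with A + I its own value is kept in the sum (11.0).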

Implementation detail #2 :

A is typically not normalized, and therefore multiplication with A will completely change the scale of the feature vectors. We address this by normalizing A such that all rows sum to one, i.e. D^-1 A, where D is the diagonal node degree matrix.
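A small sketch of row normalization (D^-1 A), assuming the adjacency matrix is a list of rows. The function name `row_normalize` is illustrative, and leaving all-zero rows as zeros is an assumption made here for isolated nodes.

```python
def row_normalize(A):
    """Return D^-1 A: divide every row by its sum (all-zero rows stay zero)."""
    normalized = []
    for row in A:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


# Path graph 0 - 1 - 2 with self-loops already added (A + I).
A_hat = [[1, 1, 0],
         [1, 1, 1],
         [0, 1, 1]]

A_norm = row_normalize(A_hat)  # each row now sums to one
```

Multiplying by `A_norm` then averages the neighboring feature vectors instead of summing them, so the feature scale no longer grows with node degree.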

Final Implementation :

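Combining the two implementation details above gives us the final propagation rule introduced in Kipf & Welling (ICLR 2017):

    f(H(l), A) = σ(D_hat^-1/2 A_hat D_hat^-1/2 H(l) W(l))

where A_hat = A + I and D_hat is the diagonal node degree matrix of A_hat. Note that this rule uses a symmetric normalization, D_hat^-1/2 A_hat D_hat^-1/2, rather than the plain row normalization.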
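A dependency-free sketch of the Kipf & Welling rule, ReLU(D_hat^-1/2 (A + I) D_hat^-1/2 H W), where D_hat is the degree matrix of A + I. The function names and toy values are assumptions for illustration; the code in this repo may normalize differently (for example by rows).

```python
import math


def normalize_adjacency(A):
    """Return D_hat^-1/2 (A + I) D_hat^-1/2 for a square list-of-rows matrix."""
    n = len(A)
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Every row of A + I sums to at least 1, so the square root is safe.
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in A_hat]
    return [[d_inv_sqrt[i] * A_hat[i][j] * d_inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(A_norm, H, W):
    """One propagation step with a pre-normalized adjacency matrix."""
    return [[max(0.0, v) for v in row]
            for row in matmul(matmul(A_norm, H), W)]


# Path graph 0 - 1 - 2, one input feature and one output feature.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H = [[1.0], [2.0], [3.0]]
W = [[1.0]]

A_norm = normalize_adjacency(A)
H_next = gcn_layer(A_norm, H, W)
```

Since the normalized matrix depends only on the graph, it can be computed once before training and reused in every layer.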

For more details, see here.

Requirements

See the installation instructions for a step-by-step installation guide. See the server instructions for server setup.

pip install http://download.pytorch.org/whl/cu80/torch-0.1.12.post2-cp27-none-linux_x86_64.whl
pip install torchvision
git clone https://github.com/meliketoy/graph-cnn.pytorch
pip install networkx

Planetoid Dataset

In this repo, we use an implementation of Planetoid, a graph-based semi-supervised learning method proposed in the following paper: Revisiting Semi-Supervised Learning with Graph Embeddings.

This dataset consists of 3 sub-datasets ('pubmed', 'cora', 'citeseer').

Each node in the dataset represents a document, and each edge represents a 'reference' (citation) relationship between two documents.

The data

Transductive learning

  • x : the feature vectors of the training instances
  • y : the one-hot labels of the training instances
  • graph : {index: [index of neighbor nodes]}, where the neighbor nodes are given as a list.

Inductive learning

  • x : the feature vectors of the labeled training instances
  • y : the one-hot labels of the training instances
  • allx : the feature vectors of both labeled and unlabeled training instances.
  • graph : {index: [index of neighbor nodes]}, where the neighbor nodes are given as a list.
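To connect the `graph` dictionary to the adjacency matrix A used by the model, the sketch below converts a {index: [neighbor indices]} mapping into a dense matrix. It assumes nodes are numbered 0 to N-1, that every node appears as a key, and that links are treated as undirected; the function name `graph_to_adjacency` is hypothetical and the repo's own loader may differ.

```python
def graph_to_adjacency(graph):
    """Build a dense, symmetric N x N adjacency matrix from a neighbor dict."""
    n = len(graph)  # assumes keys are exactly 0 .. N-1
    A = [[0] * n for _ in range(n)]
    for i, neighbors in graph.items():
        for j in neighbors:
            A[i][j] = 1
            A[j][i] = 1  # assumption: treat each link as undirected
    return A


# Toy graph in the same {index: [neighbors]} layout as the dataset.
graph = {0: [1], 1: [0, 2], 2: [1]}
A = graph_to_adjacency(graph)
```

For datasets of realistic size a sparse representation is preferable, since a dense N x N matrix grows quadratically with the number of documents.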

For more details, see here

Train network

After you have cloned the repository, you can train on a dataset by running the script below.

Download the Planetoid dataset above and pass the path of the downloaded dataset as [:dir to dataset].

python train.py --dataroot [:dir to dataset] --dataset [:cora | citeseer | pubmed] --model [:basic|drop_in]

Test (Inference) various networks

After you have finished training, you can test your network by running

python test.py --dataroot [:dir to dataset] --dataset [:cora | citeseer | pubmed] --model [:basic|drop_in]

Enjoy :-)