Stat992

This blog post gives a brief introduction to the graph convolutional network (GCN).

Introduction

This paper considers the problem of classifying nodes in a graph when only a small fraction of the nodes have labels. There are many types of graph convolutional networks; the GCN introduced in this post belongs to the family of spectral graph convolutional networks.

Overview of this technique

In this figure, $X$ is the matrix of original node features (some nodes have labels and some do not), $Z$ is the final prediction for each node, and $Y$ contains the ground-truth labels (not all nodes have labels). $C$ is the number of channels of the input layer and $F$ is the number of channels of the output layer.

The input of the GCN is the original graph and the output is a prediction for each node.

Math Formula

$X \in \mathbb{R}^{N \times C}$ is the matrix of node feature vectors, $A$ is the adjacency matrix, and $D$ is the degree matrix ($D_{ii} = \sum_j A_{ij}$).

Let $\tilde{A} = A + I_N$, which is the adjacency matrix with added self-connections. The paper uses a first-order approximation of spectral graph convolutions to obtain a simple layer-wise propagation rule (please refer to the original paper for the detailed math derivation, https://arxiv.org/pdf/1609.02907.pdf):

$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right)$

They calculate $\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$, where $\tilde{D}$ is the degree matrix of $\tilde{A}$ ($\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$).
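As an illustrative sketch (NumPy with dense matrices; a practical implementation such as pygcn uses sparse matrices, and the function name here is just for illustration), the renormalized adjacency $\hat{A}$ can be computed as:

```python
import numpy as np

def normalize_adjacency(A):
    """Compute A_hat = D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])           # add self-connections
    d = A_tilde.sum(axis=1)                    # degrees of A_tilde
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # D~^{-1/2}
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

# Example: a 3-node path graph 0-1-2
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
A_hat = normalize_adjacency(A)  # symmetric, with A_hat[0, 0] = 1/2
```

Note that $\hat{A}$ stays symmetric, so message passing weights each edge by $1/\sqrt{\tilde{d}_i \tilde{d}_j}$.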

The forward model for calculating the representation of the nodes is $Z = \mathrm{softmax}\left(\hat{A}\,\mathrm{ReLU}(\hat{A} X W^{(0)})\, W^{(1)}\right)$, where $W^{(0)} \in \mathbb{R}^{C \times H}$ is the input-to-hidden weight matrix and $W^{(1)} \in \mathbb{R}^{H \times F}$ is the hidden-to-output weight matrix; both are trained using gradient descent.
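A minimal NumPy sketch of this two-layer forward pass (toy shapes and random weights, purely illustrative; in practice the weights would be learned):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))  # shift for stability
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W0, W1):
    """Z = softmax(A_hat @ relu(A_hat @ X @ W0) @ W1)."""
    H = relu(A_hat @ X @ W0)         # hidden representation, N x H
    return softmax(A_hat @ H @ W1)   # class probabilities, N x F

# Toy shapes: N=3 nodes, C=4 input channels, H=8 hidden, F=2 classes
rng = np.random.default_rng(0)
A_hat = np.eye(3)                    # placeholder normalized adjacency
X = rng.normal(size=(3, 4))
W0 = rng.normal(size=(4, 8))
W1 = rng.normal(size=(8, 2))
Z = gcn_forward(A_hat, X, W0, W1)    # each row sums to 1
```

Each application of $\hat{A}$ mixes a node's representation with its immediate neighbors, so two layers let information flow across two hops.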

The loss is the cross-entropy computed over the labeled nodes in the graph: $\mathcal{L} = -\sum_{l \in \mathcal{Y}_L} \sum_{f=1}^{F} Y_{lf} \ln Z_{lf}$, where $\mathcal{Y}_L$ is the set of labeled node indices.
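A sketch of this masked cross-entropy in NumPy (the `labeled_idx` argument name is illustrative, not from the paper):

```python
import numpy as np

def masked_cross_entropy(Z, Y, labeled_idx):
    """L = -sum over labeled nodes l of sum_f Y[l, f] * ln Z[l, f]."""
    eps = 1e-12  # numerical safety for log
    return -np.sum(Y[labeled_idx] * np.log(Z[labeled_idx] + eps))

# Toy example: 3 nodes, 2 classes, only node 0 is labeled
Z = np.array([[0.9, 0.1],
              [0.5, 0.5],
              [0.2, 0.8]])
Y = np.array([[1., 0.],
              [0., 0.],
              [0., 0.]])
loss = masked_cross_entropy(Z, Y, labeled_idx=[0])
# loss = -ln(0.9), about 0.105; unlabeled nodes contribute nothing
```

Only the labeled rows enter the sum, which is what makes the training semi-supervised.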

As one example, the following applies the GCN to a real citation dataset.

Dataset

Dataset   Type              Nodes   Edges   Classes   Features   Label Rate
Cora      Citation Network  2708    5429    7         1433       0.052

Each node is a document, represented by a sparse bag-of-words feature vector, and each citation link between documents is an undirected edge.

Experiments

The training log (not reproduced here) shows that the model reaches high accuracy quickly; on Cora the original paper reports test accuracy of about 81.5%, with training taking only a few seconds.
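To tie the pieces together, here is a self-contained NumPy training sketch on a toy graph (two triangles joined by one edge, one labeled node per class). This is an illustration of the method with manually derived gradients, not the authors' pygcn code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by edge (2,3)
A = np.zeros((6, 6))
for i, j in [(0,1),(1,2),(0,2),(3,4),(4,5),(3,5),(2,3)]:
    A[i, j] = A[j, i] = 1.0

# Renormalization trick: A_hat = D~^{-1/2} (A + I) D~^{-1/2}
A_tilde = A + np.eye(6)
d_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
A_hat = d_inv_sqrt @ A_tilde @ d_inv_sqrt

X = np.eye(6)                        # one-hot node features (C = 6)
Y = np.zeros((6, 2))
Y[0, 0] = Y[3, 1] = 1.0              # only nodes 0 and 3 are labeled
mask = Y.sum(axis=1, keepdims=True)  # 1 for labeled rows, 0 otherwise

W0 = 0.1 * rng.normal(size=(6, 8))   # input-to-hidden
W1 = 0.1 * rng.normal(size=(8, 2))   # hidden-to-output
lr = 0.3

def forward(W0, W1):
    pre = A_hat @ X @ W0
    H = np.maximum(pre, 0)                        # ReLU
    logits = A_hat @ H @ W1
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return pre, H, e / e.sum(axis=1, keepdims=True)  # softmax

for step in range(300):
    pre, H, Z = forward(W0, W1)
    loss = -np.sum(Y * np.log(Z + 1e-12))         # loss on labeled nodes
    # Backprop by hand (A_hat is symmetric, so A_hat.T == A_hat)
    G = mask * (Z - Y)                            # gradient at the logits
    dW1 = (A_hat @ H).T @ G
    dpre = (A_hat @ G @ W1.T) * (pre > 0)
    dW0 = (A_hat @ X).T @ dpre
    W0 -= lr * dW0
    W1 -= lr * dW1

pred = forward(W0, W1)[2].argmax(axis=1)  # predicted class per node
```

After training, the labels propagate through the graph structure, so the unlabeled nodes in each triangle tend to inherit the class of their labeled neighbor.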

References

Kipf, T. N. and Welling, M., "Semi-Supervised Classification with Graph Convolutional Networks": https://arxiv.org/pdf/1609.02907.pdf

Reference PyTorch implementation (pygcn): https://github.com/tkipf/pygcn