
logo

TensorFlow or PyTorch? Both!


GraphGallery

GraphGallery is a gallery of state-of-the-art graph neural networks for TensorFlow 2.x and PyTorch. GraphGallery 0.4.x is a complete rewrite of the previous versions, and some things have changed.

👀 What's important

Differences between GraphGallery and PyTorch Geometric (PyG), Deep Graph Library (DGL), etc.:

  • PyG and DGL are just like TensorFlow, while GraphGallery is more like Keras (see the short sketch after this list)
  • GraphGallery is more user-friendly
  • GraphGallery is more efficient
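
As a quick illustration of the Keras-like workflow, here is a minimal sketch that strings together the same calls used in the Quick Start below (dataset loading, GCN, build/train/test); it is a sketch of the API shown later in this README, not additional functionality:

from graphgallery.data import Planetoid
from graphgallery.nn.models import GCN

# load a benchmark dataset and its train/val/test splits
data = Planetoid('cora', verbose=False)
idx_train, idx_val, idx_test = data.split()

# Keras-like: construct, build, train, test
model = GCN(data.graph, attr_transform="normalize_attr", device="CPU", seed=123)
model.build()
his = model.train(idx_train, idx_val, epochs=100, verbose=0)
loss, accuracy = model.test(idx_test, verbose=0)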

🚀 Installation

  • Build from source (latest version)
git clone https://github.com/EdisonLeeeee/GraphGallery.git
cd GraphGallery
python setup.py install
  • Or using pip (stable version)
pip install -U graphgallery
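
To verify the installation, you can import the package and print the active backend (the same backend() call used in the Backend section below):

import graphgallery
# prints the active backend, e.g. "TensorFlow 2.1.0 Backend"
print(graphgallery.backend())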

🤖 Implementations

In detail, the following methods are currently implemented:

Semi-supervised models

General models

Defense models

Unsupervised models

⚡ Quick Start

Datasets

For more details, please refer to GraphData.

Planetoid

fixed datasets

from graphgallery.data import Planetoid
# set `verbose=False` to avoid additional outputs 
data = Planetoid('cora', verbose=False)
graph = data.graph
idx_train, idx_val, idx_test = data.split()
# idx_train: training indices (1D NumPy array)
# idx_val: validation indices (1D NumPy array)
# idx_test: testing indices (1D NumPy array)
>>> graph
Graph(adj_matrix(2708, 2708), attr_matrix(2708, 1433), labels(2708,))

Currently the supported datasets are:

>>> data.supported_datasets
('citeseer', 'cora', 'pubmed')
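
Since the returned indices are plain 1D NumPy arrays (see the comments above), the splits can be inspected directly, for example:

# check the size of each split
print(len(idx_train), len(idx_val), len(idx_test))
# the indices can be sliced and combined like any NumPy array
print(idx_train[:5])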

NPZDataset

more scalable datasets (stored with .npz)

from graphgallery.data import NPZDataset
# set `verbose=False` to avoid additional outputs
data = NPZDataset('cora', verbose=False, standardize=False)
graph = data.graph
idx_train, idx_val, idx_test = data.split(random_state=42)
>>> graph
Graph(adj_matrix(2708, 2708), attr_matrix(2708, 1433), labels(2708,))

Currently the supported datasets are:

>>> data.supported_datasets
('citeseer', 'citeseer_full', 'cora', 'cora_ml', 'cora_full', 
 'amazon_cs', 'amazon_photo', 'coauthor_cs', 'coauthor_phy', 
 'polblogs', 'pubmed', 'flickr', 'blogcatalog')

Tensor

  • Strided (dense) Tensor
>>> backend()
TensorFlow 2.1.2 Backend

>>> from graphgallery import transforms as T
>>> arr = [1, 2, 3]
>>> T.astensor(arr)
<tf.Tensor: shape=(3,), dtype=int32, numpy=array([1, 2, 3], dtype=int32)>
  • Sparse Tensor
>>> import scipy.sparse as sp
>>> sp_matrix = sp.eye(3)
>>> T.astensor(sp_matrix)
<tensorflow.python.framework.sparse_tensor.SparseTensor at 0x7f1bbc205dd8>
  • It also works for the PyTorch backend, for example:
>>> from graphgallery import set_backend
>>> set_backend('torch') # torch, pytorch or th
PyTorch 1.6.0+cu101 Backend

>>> T.astensor(arr)
tensor([1, 2, 3])

>>> T.astensor(sp_matrix)
tensor(indices=tensor([[0, 1, 2],
                       [0, 1, 2]]),
       values=tensor([1., 1., 1.]),
       size=(3, 3), nnz=3, layout=torch.sparse_coo)
  • Convert back to a NumPy array or SciPy sparse matrix
>>> tensor = T.astensor(arr)
>>> T.tensoras(tensor)
array([1, 2, 3])

>>> sp_tensor = T.astensor(sp_matrix)
>>> T.tensoras(sp_tensor)
<3x3 sparse matrix of type '<class 'numpy.float32'>'
    with 3 stored elements in Compressed Sparse Row format>
  • Or even convert a tensor from one backend to the other
>>> tensor = T.astensor(arr, kind="T")
>>> tensor
<tf.Tensor: shape=(3,), dtype=int64, numpy=array([1, 2, 3])>
>>> T.tensor2tensor(tensor)
tensor([1, 2, 3])

>>> sp_tensor = T.astensor(sp_matrix, kind="T") # set kind="T" to convert to tensorflow tensor
>>> sp_tensor
<tensorflow.python.framework.sparse_tensor.SparseTensor at 0x7efb6836a898>
>>> T.tensor2tensor(sp_tensor)
tensor(indices=tensor([[0, 1, 2],
                       [0, 1, 2]]),
       values=tensor([1., 1., 1.]),
       size=(3, 3), nnz=3, layout=torch.sparse_coo)
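
As a practical tie-in with the datasets above, the same helpers can move a loaded graph onto the active backend. This is only a small sketch; the graph.adj_matrix / graph.attr_matrix / graph.labels attribute names are an assumption, mirroring the keyword arguments of Graph shown in "How to add your datasets" below:

from graphgallery import transforms as T

# Assumption: Graph exposes its components under the same names as the
# constructor arguments (adj_matrix, attr_matrix, labels).
adj = T.astensor(graph.adj_matrix)     # sparse adjacency -> sparse backend tensor
attr = T.astensor(graph.attr_matrix)   # node attributes -> dense/sparse tensor
labels = T.astensor(graph.labels)      # labels -> integer tensor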

Example of GCN model

from graphgallery.nn.models import GCN

model = GCN(graph, attr_transform="normalize_attr", device="CPU", seed=123)
# build your GCN model with default hyper-parameters
model.build()
# train your model. here idx_train and idx_val are numpy arrays
# verbose takes 0, 1, 2, 3, 4
his = model.train(idx_train, idx_val, verbose=1, epochs=100)
# test your model
# verbose takes 0, 1, 2
loss, accuracy = model.test(idx_test, verbose=1)
print(f'Test loss {loss:.5}, Test accuracy {accuracy:.2%}')

On the Cora dataset:

Training...
100/100 [==============================] - 1s 14ms/step - loss: 1.0161 - acc: 0.9500 - val_loss: 1.4101 - val_acc: 0.7740 - time: 1.4180
Testing...
1/1 [==============================] - 0s 62ms/step - test_loss: 1.4123 - test_acc: 0.8120 - time: 0.0620
Test loss 1.4123, Test accuracy 81.20%

Customization

  • Build your model: you can use the following statements to build your model
# one hidden layer with 32 hidden units and the ReLU activation function
>>> model.build(hiddens=32, activations='relu')

# two hidden layers with 32 and 64 hidden units; both activation functions are ReLU
>>> model.build(hiddens=[32, 64], activations='relu')

# two hidden layers with 32 and 64 hidden units and activation functions ReLU and ELU
>>> model.build(hiddens=[32, 64], activations=['relu', 'elu'])
  • Train your model
# train with validation
>>> his = model.train(idx_train, idx_val, verbose=1, epochs=100)
# train without validation
>>> his = model.train(idx_train, verbose=1, epochs=100)

Here, his is a TensorFlow History instance (see the short snippet after this list).

  • Test your model
>>> loss, accuracy = model.test(idx_test, verbose=1)
Testing...
1/1 [==============================] - 0s 62ms/step - test_loss: 1.4123 - test_acc: 0.8120 - time: 0.0620
>>> print(f'Test loss {loss:.5}, Test accuracy {accuracy:.2%}')
Test loss 1.4123, Test accuracy 81.20%
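
Since his records the per-epoch metrics, they can be read back directly; these are the same 'acc' / 'val_acc' / 'loss' / 'val_loss' keys that the Visualization section below plots:

# per-epoch metrics recorded during model.train(...)
print(list(his.history.keys()))      # e.g. ['loss', 'acc', 'val_loss', 'val_acc']
print(his.history['val_acc'][-1])    # validation accuracy of the last epoch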

Visualization

NOTE: you must install the SciencePlots package for a better preview.

import matplotlib.pyplot as plt
with plt.style.context(['science', 'no-latex']):
    fig, axes = plt.subplots(1, 2, figsize=(15, 5))
    axes[0].plot(his.history['acc'], label='Train accuracy', linewidth=3)
    axes[0].plot(his.history['val_acc'], label='Val accuracy', linewidth=3)
    axes[0].legend(fontsize=20)
    axes[0].set_title('Accuracy', fontsize=20)
    axes[0].set_xlabel('Epochs', fontsize=20)
    axes[0].set_ylabel('Accuracy', fontsize=20)

    axes[1].plot(his.history['loss'], label='Training loss', linewidth=3)
    axes[1].plot(his.history['val_loss'], label='Validation loss', linewidth=3)
    axes[1].legend(fontsize=20)
    axes[1].set_title('Loss', fontsize=20)
    axes[1].set_xlabel('Epochs', fontsize=20)
    axes[1].set_ylabel('Loss', fontsize=20)
    
    plt.autoscale(tight=True)
    plt.show()        

visualization

Using TensorFlow/PyTorch Backend

>>> import graphgallery
>>> graphgallery.backend()
TensorFlow 2.1.0 Backend

>>> graphgallery.set_backend("pytorch")
PyTorch 1.6.0+cu101 Backend

GCN using PyTorch backend

# The following code is the same as with the TensorFlow backend
>>> from graphgallery.nn.models import GCN
>>> model = GCN(graph, attr_transform="normalize_attr", device="GPU", seed=123)
>>> model.build()
>>> his = model.train(idx_train, idx_val, verbose=1, epochs=100)
Training...
100/100 [==============================] - 0s 5ms/step - loss: 0.6813 - acc: 0.9214 - val_loss: 1.0506 - val_acc: 0.7820 - time: 0.4734
>>> loss, accuracy = model.test(idx_test, verbose=1)
Testing...
1/1 [==============================] - 0s 1ms/step - test_loss: 1.0131 - test_acc: 0.8220 - time: 0.0013
>>> print(f'Test loss {loss:.5}, Test accuracy {accuracy:.2%}')
Test loss 1.0131, Test accuracy 82.20%

โ“ How to add your datasets

This is motivated by gnn-benchmark.

from graphgallery.data import Graph

# Load the adjacency matrix A, attribute matrix X and labels vector y
# A - scipy.sparse.csr_matrix of shape [n_nodes, n_nodes]
# X - scipy.sparse.csr_matrix or np.ndarray of shape [n_nodes, n_atts]
# y - np.ndarray of shape [n_nodes]

mydataset = Graph(adj_matrix=A, attr_matrix=X, labels=y)
# save dataset
mydataset.to_npz('path/to/mydataset.npz')
# load dataset
mydataset = Graph.from_npz('path/to/mydataset.npz')
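
Once saved, the custom graph can be used like the built-in ones. Below is a minimal sketch: the index splits are built by hand with NumPy (an assumption here, since the split() helper shown earlier belongs to the dataset classes, not to Graph), and the graph is then fed to the same GCN pipeline from the Quick Start:

import numpy as np
from graphgallery.nn.models import GCN

# build simple random train/val/test splits over the nodes;
# `y` is the labels vector loaded above
n_nodes = y.shape[0]
idx = np.random.RandomState(42).permutation(n_nodes)
n_train, n_val = int(0.1 * n_nodes), int(0.1 * n_nodes)
idx_train = idx[:n_train]
idx_val = idx[n_train:n_train + n_val]
idx_test = idx[n_train + n_val:]

# same workflow as in the Quick Start
model = GCN(mydataset, attr_transform="normalize_attr", device="CPU", seed=123)
model.build()
his = model.train(idx_train, idx_val, verbose=1, epochs=100)
loss, accuracy = model.test(idx_test, verbose=1)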

โ“ How to define your models

TODO

😎 More Examples

Please refer to the examples directory.

โญ TODO List

  • Add PyTorch models support
  • Add more GNN models (TF and Torch backend)
  • Support for more tasks, e.g., graph Classification and link prediction
  • Support for more types of graphs, e.g., Heterogeneous graph
  • Add Docstrings and Documentation (Building)

😘 Acknowledgement

This project is motivated by PyTorch Geometric, TensorFlow Geometric and StellarGraph, as well as the original implementations of the authors; thanks for their excellent work!