TensorFlow or PyTorch? Both!
GraphGallery is a gallery of state-of-the-art graph neural networks for TensorFlow 2.x and PyTorch. GraphGallery 0.4.x is a complete rewrite of previous versions, and some things have changed.
Differences between GraphGallery and PyTorch Geometric (PyG), Deep Graph Library (DGL), etc.:
- PyG and DGL are like TensorFlow, while GraphGallery is more like Keras
- GraphGallery is more user-friendly
- GraphGallery is more efficient
- Build from source (latest version)
git clone https://github.com/EdisonLeeeee/GraphGallery.git
cd GraphGallery
python setup.py install
- Or using pip (stable version)
pip install -U graphgallery
In detail, the following methods are currently implemented:
- ChebyNet from Michaël Defferrard et al., Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering, NIPS'16. [TF]
- GCN from Thomas N. Kipf et al., Semi-Supervised Classification with Graph Convolutional Networks, ICLR'17. [TF], [Torch]
- GraphSAGE from William L. Hamilton et al., Inductive Representation Learning on Large Graphs, NIPS'17. [TF]
- FastGCN from Jie Chen et al., FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling, ICLR'18. [TF]
- LGCN from Hongyang Gao et al., Large-Scale Learnable Graph Convolutional Networks, KDD'18. [TF]
- GAT from Petar Veličković et al., Graph Attention Networks, ICLR'18. [TF], [Torch]
- SGC from Felix Wu et al., Simplifying Graph Convolutional Networks, ICML'19. [TF], [Torch]
- GWNN from Bingbing Xu et al., Graph Wavelet Neural Network, ICLR'19. [TF]
- GMNN from Meng Qu et al., Graph Markov Neural Networks, ICML'19. [TF]
- ClusterGCN from Wei-Lin Chiang et al., Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks, KDD'19. [TF], [Torch]
- DAGNN from Meng Liu et al., Towards Deeper Graph Neural Networks, KDD'20. [TF]
- RobustGCN from Dingyuan Zhu et al., Robust Graph Convolutional Networks Against Adversarial Attacks, KDD'19. [TF]
- SBVAT from Zhijie Deng et al., Batch Virtual Adversarial Training for Graph Convolutional Networks, ICML'19. [TF]
- OBVAT from Zhijie Deng et al., Batch Virtual Adversarial Training for Graph Convolutional Networks, ICML'19. [TF]
- DeepWalk from Bryan Perozzi et al., DeepWalk: Online Learning of Social Representations, KDD'14. [TF]
- Node2vec from Aditya Grover et al., node2vec: Scalable Feature Learning for Networks, KDD'16. [TF]
For more details, please refer to GraphData.
Fixed datasets
from graphgallery.data import Planetoid
# set `verbose=False` to avoid additional outputs
data = Planetoid('cora', verbose=False)
graph = data.graph
idx_train, idx_val, idx_test = data.split()
# idx_train: training indices: 1D Numpy array
# idx_val: validation indices: 1D Numpy array
# idx_test: testing indices: 1D Numpy array
>>> graph
Graph(adj_matrix(2708, 2708), attr_matrix(2708, 2708), labels(2708,))
Currently, the supported datasets are:
>>> data.supported_datasets
('citeseer', 'cora', 'pubmed')
More scalable datasets (stored as .npz files)
from graphgallery.data import NPZDataset
# set `verbose=False` to avoid additional outputs
data = NPZDataset('cora', verbose=False, standardize=False)
graph = data.graph
idx_train, idx_val, idx_test = data.split(random_state=42)
>>> graph
Graph(adj_matrix(2708, 2708), attr_matrix(2708, 2708), labels(2708,))
Currently, the supported datasets are:
>>> data.supported_datasets
('citeseer', 'citeseer_full', 'cora', 'cora_ml', 'cora_full',
'amazon_cs', 'amazon_photo', 'coauthor_cs', 'coauthor_phy',
'polblogs', 'pubmed', 'flickr', 'blogcatalog')
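The role of random_state in data.split(random_state=42) can be illustrated with a plain-Python sketch. This is not GraphGallery's implementation: the helper name random_split and the 10%/10%/80% proportions are assumptions made only for illustration. It shows how a fixed seed yields the same three disjoint index sets on every run.

```python
import random


def random_split(num_nodes, train_frac=0.1, val_frac=0.1, random_state=None):
    """Hypothetical helper: shuffle node indices and cut them into three disjoint parts."""
    rng = random.Random(random_state)  # a fixed seed makes the shuffle reproducible
    indices = list(range(num_nodes))
    rng.shuffle(indices)

    n_train = int(num_nodes * train_frac)
    n_val = int(num_nodes * val_frac)

    idx_train = sorted(indices[:n_train])
    idx_val = sorted(indices[n_train:n_train + n_val])
    idx_test = sorted(indices[n_train + n_val:])  # everything that is left
    return idx_train, idx_val, idx_test


# 2708 is the number of nodes in the Cora graph shown above.
idx_train, idx_val, idx_test = random_split(2708, random_state=42)
print(len(idx_train), len(idx_val), len(idx_test))
```

Calling the helper twice with the same random_state returns identical splits, which is what makes experiments comparable across runs.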
- Strided (dense) Tensor
>>> from graphgallery import backend
>>> backend()
TensorFlow 2.1.2 Backend
>>> from graphgallery import transforms as T
>>> arr = [1, 2, 3]
>>> T.astensor(arr)
<tf.Tensor: shape=(3,), dtype=int32, numpy=array([1, 2, 3], dtype=int32)>
- Sparse Tensor
>>> import scipy.sparse as sp
>>> sp_matrix = sp.eye(3)
>>> T.astensor(sp_matrix)
<tensorflow.python.framework.sparse_tensor.SparseTensor at 0x7f1bbc205dd8>
- This also works with the PyTorch backend:
>>> from graphgallery import set_backend
>>> set_backend('torch') # torch, pytorch or th
PyTorch 1.6.0+cu101 Backend
>>> T.astensor(arr)
tensor([1, 2, 3])
>>> T.astensor(sp_matrix)
tensor(indices=tensor([[0, 1, 2],
[0, 1, 2]]),
values=tensor([1., 1., 1.]),
size=(3, 3), nnz=3, layout=torch.sparse_coo)
- Convert back to a NumPy array or a SciPy sparse matrix
>>> tensor = T.astensor(arr)
>>> T.tensoras(tensor)
array([1, 2, 3])
>>> sp_tensor = T.astensor(sp_matrix)
>>> T.tensoras(sp_tensor)
<3x3 sparse matrix of type '<class 'numpy.float32'>'
with 3 stored elements in Compressed Sparse Row format>
- Or even convert a tensor from one backend to the other
>>> tensor = T.astensor(arr, kind="T")
>>> tensor
<tf.Tensor: shape=(3,), dtype=int64, numpy=array([1, 2, 3])>
>>> T.tensor2tensor(tensor)
tensor([1, 2, 3])
>>> sp_tensor = T.astensor(sp_matrix, kind="T") # set kind="T" to convert to tensorflow tensor
>>> sp_tensor
<tensorflow.python.framework.sparse_tensor.SparseTensor at 0x7efb6836a898>
>>> T.tensor2tensor(sp_tensor)
tensor(indices=tensor([[0, 1, 2],
[0, 1, 2]]),
values=tensor([1., 1., 1.]),
size=(3, 3), nnz=3, layout=torch.sparse_coo)
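The sparse tensors printed above use the coordinate (COO) layout: one list of row indices, one list of column indices, and one list of values. The following plain-Python sketch is independent of GraphGallery, and dense_to_coo is a hypothetical helper. It reproduces the indices and values shown for the 3x3 identity matrix.

```python
def dense_to_coo(matrix):
    """Return ([row_indices, col_indices], values) for the nonzero entries of a 2-D list."""
    rows, cols, values = [], [], []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 0:
                rows.append(i)
                cols.append(j)
                values.append(float(value))
    return [rows, cols], values


# 3x3 identity matrix, the dense equivalent of sp.eye(3)
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
indices, values = dense_to_coo(identity)
print(indices)  # [[0, 1, 2], [0, 1, 2]]
print(values)   # [1.0, 1.0, 1.0]
```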
from graphgallery.nn.models import GCN
model = GCN(graph, attr_transform="normalize_attr", device="CPU", seed=123)
# build your GCN model with default hyper-parameters
model.build()
# train your model. here idx_train and idx_val are numpy arrays
# verbose takes 0, 1, 2, 3, 4
his = model.train(idx_train, idx_val, verbose=1, epochs=100)
# test your model
# verbose takes 0, 1, 2
loss, accuracy = model.test(idx_test, verbose=1)
print(f'Test loss {loss:.5}, Test accuracy {accuracy:.2%}')
On the Cora dataset:
Training...
100/100 [==============================] - 1s 14ms/step - loss: 1.0161 - acc: 0.9500 - val_loss: 1.4101 - val_acc: 0.7740 - time: 1.4180
Testing...
1/1 [==============================] - 0s 62ms/step - test_loss: 1.4123 - test_acc: 0.8120 - time: 0.0620
Test loss 1.4123, Test accuracy 81.20%
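This README does not say what attr_transform="normalize_attr" does. Assuming it refers to the row normalization commonly applied to node attributes before training a GCN (an assumption, not confirmed here), the idea can be sketched in plain Python. The helper row_normalize is hypothetical and is not part of GraphGallery.

```python
def row_normalize(matrix):
    """Scale each row so its entries sum to 1; all-zero rows are left as zeros."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            normalized.append([0.0 for _ in row])  # avoid division by zero
        else:
            normalized.append([value / total for value in row])
    return normalized


# Toy attribute matrix: 3 nodes, 4 attributes each
features = [[1, 1, 0, 2],
            [0, 0, 0, 0],
            [0, 3, 0, 0]]
normalized_features = row_normalize(features)
print(normalized_features[0])  # [0.25, 0.25, 0.0, 0.5]
```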
- Build your model. You can use any of the following statements:
# one hidden layer with 32 hidden units and ReLU activation
>>> model.build(hiddens=32, activations='relu')
# two hidden layers with 32 and 64 hidden units, both with ReLU activation
>>> model.build(hiddens=[32, 64], activations='relu')
# two hidden layers with 32 and 64 hidden units, with ReLU and ELU activations respectively
>>> model.build(hiddens=[32, 64], activations=['relu', 'elu'])
- Train your model
# train with validation
>>> his = model.train(idx_train, idx_val, verbose=1, epochs=100)
# train without validation
>>> his = model.train(idx_train, verbose=1, epochs=100)
Here, the returned his object is a TensorFlow History instance.
- Test your model
>>> loss, accuracy = model.test(idx_test, verbose=1)
Testing...
1/1 [==============================] - 0s 62ms/step - test_loss: 1.4123 - test_acc: 0.8120 - time: 0.0620
>>> print(f'Test loss {loss:.5}, Test accuracy {accuracy:.2%}')
Test loss 1.4123, Test accuracy 81.20%
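The hiddens and activations arguments to build accept either a single value or a list, as the three build calls above show. One way to read that convention is sketched below in plain Python. The helper expand_layer_spec is hypothetical and only illustrates the pairing described in the comments; it is not how GraphGallery necessarily implements it.

```python
def expand_layer_spec(hiddens, activations):
    """Hypothetical helper: pair each hidden-layer size with an activation name."""
    # A single integer means one hidden layer.
    hiddens = [hiddens] if isinstance(hiddens, int) else list(hiddens)

    # A single activation name is reused for every hidden layer.
    if isinstance(activations, str):
        activations = [activations] * len(hiddens)
    else:
        activations = list(activations)

    if len(activations) != len(hiddens):
        raise ValueError("hiddens and activations must have the same length")
    return list(zip(hiddens, activations))


print(expand_layer_spec(32, 'relu'))                  # [(32, 'relu')]
print(expand_layer_spec([32, 64], 'relu'))            # [(32, 'relu'), (64, 'relu')]
print(expand_layer_spec([32, 64], ['relu', 'elu']))   # [(32, 'relu'), (64, 'elu')]
```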
NOTE: the plotting code below uses the 'science' style, which requires the SciencePlots package.
import matplotlib.pyplot as plt
with plt.style.context(['science', 'no-latex']):
    fig, axes = plt.subplots(1, 2, figsize=(15, 5))
    axes[0].plot(his.history['acc'], label='Train accuracy', linewidth=3)
    axes[0].plot(his.history['val_acc'], label='Val accuracy', linewidth=3)
    axes[0].legend(fontsize=20)
    axes[0].set_title('Accuracy', fontsize=20)
    axes[0].set_xlabel('Epochs', fontsize=20)
    axes[0].set_ylabel('Accuracy', fontsize=20)
    axes[1].plot(his.history['loss'], label='Training loss', linewidth=3)
    axes[1].plot(his.history['val_loss'], label='Validation loss', linewidth=3)
    axes[1].legend(fontsize=20)
    axes[1].set_title('Loss', fontsize=20)
    axes[1].set_xlabel('Epochs', fontsize=20)
    axes[1].set_ylabel('Loss', fontsize=20)
    plt.autoscale(tight=True)
    plt.show()
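The same his.history dictionary can also be summarized without plotting. The sketch below assumes only what the plotting code relies on: history maps metric names such as 'val_acc' to one value per epoch. The numbers are made up for illustration, and best_epoch is a hypothetical helper.

```python
def best_epoch(history, metric="val_acc"):
    """Return (1-based epoch, value) at which the given metric is highest."""
    values = history[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# Stand-in for his.history; these values are illustrative, not real results.
history = {
    "acc": [0.50, 0.80, 0.95],
    "val_acc": [0.40, 0.78, 0.77],
    "loss": [1.90, 1.40, 1.00],
    "val_loss": [1.92, 1.50, 1.41],
}

epoch, value = best_epoch(history)
print(f"Best validation accuracy {value:.2%} at epoch {epoch}")
```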
>>> import graphgallery
>>> graphgallery.backend()
TensorFlow 2.1.0 Backend
>>> graphgallery.set_backend("pytorch")
PyTorch 1.6.0+cu101 Backend
GCN using PyTorch backend
# The following code is the same as with the TensorFlow backend
>>> from graphgallery.nn.models import GCN
>>> model = GCN(graph, attr_transform="normalize_attr", device="GPU", seed=123)
>>> model.build()
>>> his = model.train(idx_train, idx_val, verbose=1, epochs=100)
Training...
100/100 [==============================] - 0s 5ms/step - loss: 0.6813 - acc: 0.9214 - val_loss: 1.0506 - val_acc: 0.7820 - time: 0.4734
>>> loss, accuracy = model.test(idx_test, verbose=1)
Testing...
1/1 [==============================] - 0s 1ms/step - test_loss: 1.0131 - test_acc: 0.8220 - time: 0.0013
>>> print(f'Test loss {loss:.5}, Test accuracy {accuracy:.2%}')
Test loss 1.0131, Test accuracy 82.20%
This is motivated by gnn-benchmark
from graphgallery.data import Graph
# Load the adjacency matrix A, attribute matrix X and labels vector y
# A - scipy.sparse.csr_matrix of shape [n_nodes, n_nodes]
# X - scipy.sparse.csr_matrix or np.ndarray of shape [n_nodes, n_atts]
# y - np.ndarray of shape [n_nodes]
mydataset = Graph(adj_matrix=A, attr_matrix=X, labels=y)
# save dataset
mydataset.to_npz('path/to/mydataset.npz')
# load dataset
mydataset = Graph.from_npz('path/to/mydataset.npz')
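The adjacency matrix A is expected in CSR form. As a minimal sketch of what that format holds, the plain-Python helper below (hypothetical, not part of GraphGallery) turns an undirected edge list into the (data, indices, indptr) triplet that scipy.sparse.csr_matrix accepts.

```python
def edges_to_csr(num_nodes, edges):
    """Build (data, indices, indptr) for a symmetric, unweighted adjacency matrix."""
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)  # undirected graph: store both directions

    data, indices, indptr = [], [], [0]
    for node_neighbors in neighbors:
        cols = sorted(node_neighbors)
        indices.extend(cols)             # column index of each nonzero in this row
        data.extend([1.0] * len(cols))   # unweighted edges
        indptr.append(len(indices))      # where the next row starts
    return data, indices, indptr


# A path graph with 4 nodes: 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]
data, indices, indptr = edges_to_csr(4, edges)
print(indices)  # [1, 0, 2, 1, 3, 2]
print(indptr)   # [0, 1, 3, 5, 6]

# With SciPy installed, the triplet maps onto the CSR constructor:
# A = scipy.sparse.csr_matrix((data, indices, indptr), shape=(4, 4))
```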
TODO
Please refer to the examples directory.
- Add PyTorch models support
- Add more GNN models (TF and Torch backend)
- Support for more tasks, e.g., graph classification and link prediction
- Support for more types of graphs, e.g., heterogeneous graphs
- Add Docstrings and Documentation (Building)
This project is motivated by PyTorch Geometric, TensorFlow Geometric, and StellarGraph, as well as the authors' original implementations. Thanks for their excellent work!