/nebula-dgl

NebulaGraph DGL(Deep Graph Library) Integration Package. (WIP)

Primary LanguagePythonApache License 2.0Apache-2.0

nebula-dgl

pdm-managed License

nebula-dgl is the Lib for NebulaGraph integration with Deep Graph Library (DGL).

nebula-dgl is still WIP, there is a demo project here .

Guide

Installation

Install from PyPi

python3 -m pip install nebula-dgl
python3 -m pip install dgl dglgo -f https://data.dgl.ai/wheels/repo.html

Install from codebase for dev

python3 -m pip install nebula3-python
python3 -m pip install dgl dglgo -f https://data.dgl.ai/wheels/repo.html

# build and install
python3 -m pip install .

Playground

Clone this repository to your local directory first.

git clone https://github.com/wey-gu/nebula-dgl.git
cd nebula-dgl
  1. Deploy NebulaGraph playground with Nebula-UP:

Install NebulaGraph:

curl -fsSL nebula-up.siwei.io/install.sh | bash

Load example data:

~/.nebula-up/load-basketballplayer-dataset.sh
  1. Create a jupyter notebook in same docker network: nebula-net
docker run -it --name dgl -p 8888:8888 --network nebula-net \
    -v "$PWD":/home/jovyan/work jupyter/datascience-notebook \
    start-notebook.sh --NotebookApp.token='secret'

Now you can either:

Or:

  • run ipython with the container:
docker exec -it dgl ipython
cd work
  1. Install nebula-dgl in notebook:

Install nebula-dgl:

!python3 -m pip install python3 -m pip install nebula3-python==3.4.0
!python3 -m pip install dgl dglgo -f https://data.dgl.ai/wheels/repo.html
!python3 -m pip install .
  1. Try with a homogeneous graph:
import yaml
import networkx as nx

from nebula_dgl import NebulaLoader


nebula_config = {
    "graph_hosts": [
                ('graphd', 9669),
                ('graphd1', 9669),
                ('graphd2', 9669)
            ],
    "nebula_user": "root",
    "nebula_password": "nebula",
}

# scan loader(mostly for training)

with open('example/homogeneous_graph.yaml', 'r') as f:
    feature_mapper = yaml.safe_load(f)

nebula_loader = NebulaLoader(nebula_config, feature_mapper)
homo_dgl_graph = nebula_loader.load()

# or query based(mostly for small graph when inference)

query = """
MATCH p=()-[:follow]->() RETURN p
"""
nebula_loader = NebulaLoader(nebula_config, feature_mapper, query=query, query_space="basketballplayer")
homo_dgl_graph = nebula_loader.load()

nx_g = homo_dgl_graph.to_networkx()
nx.draw(nx_g, with_labels=True, pos=nx.spring_layout(nx_g))

Result:

nx_draw

  1. Compute the degree centrality of the graph:
nx.degree_centrality(nx_g)

Result:

{0: 0.0,
 1: 0.04,
 2: 0.02,
 3: 0.02,
 4: 0.06,
 5: 0.06,
 6: 0.04,
 7: 0.24,
 8: 0.16,
 9: 0.0,
 10: 0.02,
 11: 0.04,
 12: 0.04,
 13: 0.04,
 14: 0.1,
 15: 0.04,
 16: 0.0,
 17: 0.1,
 18: 0.04,
 19: 0.04,
 20: 0.0,
 21: 0.0,
 22: 0.04,
 23: 0.02,
 24: 0.02,
 25: 0.04,
 26: 0.06,
 27: 0.0,
 28: 0.02,
 29: 0.0,
 30: 0.04,
 31: 0.12,
 32: 0.04,
 33: 0.22,
 34: 0.14,
 35: 0.1,
 36: 0.04,
 37: 0.14,
 38: 0.1,
 39: 0.02,
 40: 0.14,
 41: 0.08,
 42: 0.1,
 43: 0.12,
 44: 0.12,
 45: 0.08,
 46: 0.1,
 47: 0.02,
 48: 0.04,
 49: 0.12,
 50: 0.06}

NebulaGraph to DGL

from nebula_dgl import NebulaLoader


nebula_config = {
    "graph_hosts": [
                ('graphd', 9669),
                ('graphd1', 9669),
                ('graphd2', 9669)
            ],
    "nebula_user": "root",
    "nebula_password": "nebula"
}

# load feature_mapper from yaml file
with open('example/nebula_to_dgl_mapper.yaml', 'r') as f:
    feature_mapper = yaml.safe_load(f)

nebula_loader = NebulaLoader(nebula_config, feature_mapper)
dgl_graph = nebula_loader.load()

Play homogeneous graph algorithms in networkx

import networkx

with open('example/homogeneous_graph.yaml', 'r') as f:
    feature_mapper = yaml.safe_load(f)

nebula_loader = NebulaLoader(nebula_config, feature_mapper)
homo_dgl_graph = nebula_loader.load()
nx_g = homo_dgl_graph.to_networkx()

# plot it
networkx.draw(nx_g, with_lables=True)

# get degree
networkx.degree(nx_g)

# get degree centrality
networkx.degree_centrality(nx_g)

Multi-Part Loader for NebulaGraph

  1. For now, the Multi-Part Loader is slow like sequence scan, need to profile the performance.
import yaml
import networkx as nx
import matplotlib.pyplot as plt

from nebula_dgl import NebulaReducedLoader


nebula_config = {
    "graph_hosts": [
                ('127.0.0.1', 9669)
            ],
    "nebula_user": "root",
    "nebula_password": "nebula",
}

with open('example/homogeneous_graph.yaml', 'r') as f:
    feature_mapper = yaml.safe_load(f)

# you only need change the following line: from NebulaLoader to NebulaReducedLoader
# Easy for you to use the multi-part loader 
nebula_reduced_loader = NebulaReducedLoader(nebula_config, feature_mapper)
homo_dgl_graph = nebula_reduced_loader.load()
nx_g = homo_dgl_graph.to_networkx()
nx.draw(nx_g, with_labels=True, pos=nx.spring_layout(nx_g))
plt.savefig("multi_graph.png")