DGLD is an open-source library for Deep Graph Anomaly Detection based on PyTorch and DGL. It provides a unified interface to popular graph anomaly detection methods, covering data loading, data augmentation, model training and evaluation. Widely used modules are also organized so that developers and researchers can quickly implement their own models.
- [Aug 2022] We have released an easy-to-use graphical command line tool for running experiments with different models, datasets and customized parameters. Users can select all the settings on the page, click 'Submit' and copy the generated shell script to the terminal.
- [July 2022] For PyG users, we recommend PyGOD, another comprehensive package that also supports many graph anomaly detection methods.
- [June 2022] We recently received feedback that the reported results differ slightly from those in the original papers. This is due to differences in the anomaly injection setting, the graph augmentation and the sampling. We will provide more details on the settings.
Basic environment installation:
conda create -n dgld python=3.8.0
conda activate dgld
conda install cudatoolkit==11.3.1
pip install dgl-cu113==0.8.1 dglgo==0.0.1 -f https://data.dgl.ai/wheels/repo.html
pip install torch==1.11.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
Then clone the DGLD project, enter the directory and run:
git clone git@github.com:EagleLab-ZJU/DGLD.git
cd DGLD
pip install -r requirements.txt
To check whether you have successfully installed the package and environment, you can simply run
python example.py
pip install DGLD
Now you can enjoy DGLD!
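If something fails at this point, a version mismatch is a likely cause. The following is a minimal sketch, not part of DGLD, that compares the installed package versions with the ones pinned in the commands above. The helper name `check_environment` is hypothetical, and the distribution names are assumed to match the names used in the `pip install` commands.

```python
from importlib import metadata

# Versions pinned in the installation commands above.
EXPECTED = {"torch": "1.11.0+cu113", "dgl-cu113": "0.8.1", "dglgo": "0.0.1"}


def check_environment(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (expected, installed or None, matches)}.

    Hypothetical helper for illustration; DGLD does not provide it.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {wanted}, found {found} [{status}]")
```

A mismatch is not necessarily an error, since other CUDA builds may also work, but it is a useful first thing to look at.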
We provide an example.py showing how it works. The four steps below walk through a basic run of DGLD.
DGLD supports multiple data import methods, including PyTorch Geometric, DGL and custom data, and it combines data loading and anomaly injection into one step. In addition to the built-in datasets ("Cora", "Citeseer", "Pubmed", "BlogCatalog", "Flickr", "ogbn-arxiv" and "ACM"), DGLD also accepts custom data.
For anomaly detection, DGLD injects abnormal nodes in two ways, structural and contextual, controlled by two parameters, p and k. In the example below, gnd_dataset is an instance of GraphNodeAnomalyDectionDataset, g is an instance of dgl.DGLGraph, and label is an instance of torch.Tensor holding the anomaly class. The example shows that a few lines of code are enough to load a dataset and inject anomalies.
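from DGLD.utils.dataset import GraphNodeAnomalyDectionDataset

# Load Cora and inject structural and contextual anomalies (controlled by p and k).
gnd_dataset = GraphNodeAnomalyDectionDataset("Cora", p=15, k=50)
g = gnd_dataset[0]                  # DGL graph with the injected anomalies
label = gnd_dataset.anomaly_label   # anomaly labels as a torch.Tensor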
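To make the two injection types concrete, here is a minimal pure-Python sketch of a commonly used injection scheme. It is not DGLD's implementation. It assumes that p is the size of each injected clique and that k is the number of candidate nodes sampled for each contextual anomaly. The parameter q (number of cliques) and both function names are hypothetical and used only for illustration.

```python
import math
import random


def inject_structural(edges, num_nodes, p, q, rng):
    """Pick q groups of p nodes and fully connect each group (a clique).

    edges is a set of (u, v) tuples with u < v. Returns the enlarged
    edge set and the list of nodes that became structural anomalies.
    """
    chosen = rng.sample(range(num_nodes), p * q)
    new_edges = set(edges)
    for c in range(q):
        clique = chosen[c * p:(c + 1) * p]
        for i, u in enumerate(clique):
            for v in clique[i + 1:]:
                new_edges.add((min(u, v), max(u, v)))
    return new_edges, chosen


def inject_contextual(features, targets, k, rng):
    """For each target, sample k candidate nodes and copy the features of
    the candidate that is farthest away in Euclidean distance."""
    new_features = [row[:] for row in features]
    for t in targets:
        candidates = rng.sample(range(len(features)), k)
        farthest = max(candidates, key=lambda c: math.dist(features[t], features[c]))
        new_features[t] = features[farthest][:]
    return new_features


# Toy usage on a 40-node path graph with random 4-dimensional features.
rng = random.Random(0)
num_nodes = 40
features = [[rng.random() for _ in range(4)] for _ in range(num_nodes)]
edges = {(i, i + 1) for i in range(num_nodes - 1)}

new_edges, structural = inject_structural(edges, num_nodes, p=5, q=2, rng=rng)
remaining = [n for n in range(num_nodes) if n not in structural]
contextual = rng.sample(remaining, 10)
new_features = inject_contextual(features, contextual, k=8, rng=rng)
```

Under these assumptions, structural anomalies differ from normal nodes only in their connectivity, while contextual anomalies keep their edges but carry attributes borrowed from a dissimilar node.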
DGLD ships a number of standard detection methods, and constructing and training a model takes only a few lines.
from DGLD.models import CoLA
model = CoLA(in_feats=g.ndata['feat'].shape[1])
The fit function takes the number of epochs and the device. For GPU training, device should be an int such as 0; for CPU training, pass the string 'cpu'.
from DGLD.utils.evaluation import split_auc
model.fit(g, num_epoch=5, device=0)
result = model.predict(g, auc_test_rounds=2)
print(split_auc(label, result))
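For readers unfamiliar with the metric, the sketch below shows one way an AUC can be computed overall and separately per anomaly type. It is not the code behind `split_auc`. It assumes a label convention of 0 for normal, 1 for structural and 2 for contextual nodes, and that higher scores mean "more anomalous". The function names are hypothetical.

```python
def roc_auc(labels, scores):
    """Probability that a random anomaly scores higher than a random
    normal node; ties count as one half. Quadratic time, so this is
    meant for illustration on small inputs only."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal node")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_type(anomaly_label, scores):
    """AUC over all anomalies, and over each anomaly type vs. normal nodes.

    Assumed convention: 0 = normal, 1 = structural, 2 = contextual.
    """
    out = {"all": roc_auc([y != 0 for y in anomaly_label], scores)}
    for name, cls in (("structural", 1), ("contextual", 2)):
        # Keep only normal nodes and anomalies of the current type.
        keep = [(y, s) for y, s in zip(anomaly_label, scores) if y in (0, cls)]
        out[name] = roc_auc([y == cls for y, _ in keep], [s for _, s in keep])
    return out


# Toy usage: four nodes, one structural and one contextual anomaly.
toy = auc_by_type([0, 0, 1, 2], [0.1, 0.4, 0.35, 0.8])
```

Reporting the per-type values alongside the overall one can show whether a model mainly detects one kind of injected anomaly.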
DGLD provides built-in graph anomaly detection datasets that are widely used by existing methods.
Dataset | Nodes | Edges | Attributes | Anomalies |
---|---|---|---|---|
BlogCatalog | 5196 | 171743 | 8189 | 300 |
Flickr | 7575 | 239738 | 12047 | 450 |
ACM | 16484 | 71980 | 8337 | 600 |
Cora | 2708 | 5429 | 1433 | 150 |
Citeseer | 3327 | 4732 | 3703 | 150 |
Pubmed | 19717 | 44338 | 500 | 600 |
ogbn-arxiv | 169343 | 1166243 | 128 | 6000 |
Implemented Results (Sorted Results)
Method | Cora | Citeseer | Pubmed | BlogCatalog | Flickr | ACM | Arxiv |
---|---|---|---|---|---|---|---|
CoLA | 0.8823 | 0.8765 | 0.9632 | 0.6488 | 0.5790 | 0.8194 | 0.8833 |
SL-GAD | 0.8937 | 0.9003 | 0.9532 | 0.7782 | 0.7664 | 0.8146 | 0.7483 |
ANEMONE | 0.8916 | 0.8633 | 0.9630 | - | - | - | - |
DOMINANT | 0.8555 | 0.8236 | 0.8295 | 0.7795 | 0.7559 | 0.7067 | - |
ComGA | 0.9677 | 0.8020 | 0.9205 | 0.7908 | 0.7346 | 0.7147 | - |
AnomalyDAE | 0.9679 | 0.8832 | 0.9182 | 0.7666 | 0.7437 | 0.7091 | - |
ALARM | 0.9479 | 0.8318 | 0.8296 | 0.7718 | 0.7596 | 0.6952 | - |
AAGNN | 0.7371 | 0.7616 | 0.7442 | 0.7648 | 0.7388 | 0.4868 | - |
GUIDE | 0.9785 | 0.9778 | 0.9535 | 0.7675 | 0.7337 | 0.7153 | - |
CONAD | 0.9646 | 0.9116 | 0.9396 | 0.7863 | 0.7395 | 0.7005 | 0.6365 |
GAAN | 0.7964 | 0.7979 | 0.7862 | 0.7320 | 0.7510 | - | 0.8605 |
DONE | 0.9636 | 0.8948 | 0.8803 | 0.7842 | 0.7555 | 0.7094 | 0.7093 |
ONE | 0.9717 | 0.9900 | 0.8991 | 0.7924 | 0.7712 | 0.7072 | - |
AdONE | 0.9629 | 0.8935 | 0.9030 | 0.7438 | 0.7595 | - | 0.7651 |
GCNAE | 0.7707 | 0.7696 | 0.7941 | 0.7363 | 0.7529 | - | 0.7530 |
MLPAE | 0.7617 | 0.7538 | 0.7211 | 0.7399 | 0.7514 | - | 0.7382 |
SCAN | 0.6508 | 0.6671 | 0.7361 | 0.4926 | 0.6498 | - | 0.6905 |
- More Graph Anomaly Detection Methods
- Edge/Community/Graph Level Anomaly Detection Tasks
- Graphical Operation Interface