This paper was accepted at VLDB 2022 and is published in Vol. 15, Issue 3.
xFraud is an explainable fraud transaction prediction system. It consists of a predictor, which uses a self-attentive heterogeneous graph neural network to learn expressive representations for malicious transaction detection from the heterogeneous transaction graph, and an explainer, which generates meaningful, human-understandable explanations from graphs to support further processing in the business unit.
Set up the Python environment with conda and install PyTorch and its dependencies. For the PyTorch-related packages, install the versions that match your CUDA device. The experiment scripts below use the CUDA 10.1 version.
bash ./scripts/install-env-publish.sh
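After installation, it can help to confirm that the installed packages match the versions in the package list later in this README. The following is a minimal sketch, not part of the project: the `PINNED` selection and the helper names `installed_version` and `compare_versions` are assumptions made for illustration, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# A few pinned versions copied from the package list in this README.
# The selection is illustrative; extend it as needed.
PINNED = {
    "networkx": "2.4",
    "plyvel": "1.3.0",
    "torch": "1.5.0",
    "torch-geometric": "1.5.0",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(pinned: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package name to (status, installed version).

    Status is "ok", "mismatch", or "missing".
    """
    report = {}
    for name, wanted in pinned.items():
        found = installed_version(name)
        if found is None:
            status = "missing"
        elif found == wanted:
            status = "ok"
        else:
            status = "mismatch"
        report[name] = (status, found)
    return report


if __name__ == "__main__":
    for name, (status, found) in compare_versions(PINNED).items():
        print(f"{name}: {status} (installed: {found})")
```

A "mismatch" is not necessarily a problem, since the PyTorch-related packages depend on the CUDA version of your device.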
If you want to try the pyHGT code in this project, include pyHGT as a submodule by running:
bash ./scripts/add-submodules.sh
This step prevents out-of-memory (OOM) errors when loading large feature data.
Node features are stored in LevelDB.
For the provided data sample, generating the feature store is optional, because the feature store used by the subsequent scripts (./data/feat_store_publish.db) is already included.
python ./scripts/setup-feature-store.py
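The idea behind a feature store can be sketched as follows: node features are written to a key-value store keyed by node ID, and only the rows needed for a batch are read back, so the full feature matrix never has to sit in memory. This is a minimal sketch and not the project's actual store layout: the key format, the float32 packing, and the names `DictStore`, `put_features`, and `get_features` are assumptions made for illustration. `DictStore` is an in-memory stand-in; a `plyvel.DB` handle offers the same byte-oriented `put` and `get` methods and could be passed in its place.

```python
import struct
from typing import Dict, Iterable, List, Optional


class DictStore:
    """In-memory stand-in exposing byte-oriented put/get, like a LevelDB handle."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)


def encode_features(values: Iterable[float]) -> bytes:
    """Pack a feature vector as little-endian float32 (assumed encoding)."""
    vals = list(values)
    return struct.pack(f"<{len(vals)}f", *vals)


def decode_features(blob: bytes) -> List[float]:
    """Inverse of encode_features: 4 bytes per float32 value."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


def put_features(store, features: Dict[str, List[float]]) -> None:
    """Write one record per node, keyed by the UTF-8 encoded node ID."""
    for node_id, vector in features.items():
        store.put(node_id.encode("utf-8"), encode_features(vector))


def get_features(store, node_ids: Iterable[str]) -> Dict[str, List[float]]:
    """Fetch only the requested nodes; IDs not in the store are skipped."""
    result = {}
    for node_id in node_ids:
        blob = store.get(node_id.encode("utf-8"))
        if blob is not None:
            result[node_id] = decode_features(blob)
    return result


if __name__ == "__main__":
    store = DictStore()
    put_features(store, {"txn_1": [0.5, 1.0], "txn_2": [-2.0, 0.25]})
    print(get_features(store, ["txn_2"]))
```

Reading per batch trades some lookup latency for a memory footprint bounded by the batch size rather than by the size of the graph.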
bash ./scripts/run-detector.sh
Commonly used parameters are listed in the shell script.
python ./xfraud/run_explainer.py
python ./xfraud/explainer-eval-hitrate/ours.py
python ./xfraud/explainer-eval-hitrate/random-baseline.py
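As a rough illustration of what a top-k hit rate evaluation with a random baseline can look like, the sketch below compares explainer edge weights against annotated edge weights. It assumes that the hit rate at k is the fraction of the k highest-annotated edges that also appear among the explainer's k highest-weighted edges. The exact definition, tie handling, and data format used by ours.py and random-baseline.py may differ, and all function names here are hypothetical.

```python
import random
from typing import Dict, Hashable, List


def top_k(weights: Dict[Hashable, float], k: int) -> List[Hashable]:
    """Return the k edges with the largest weights.

    Ties are broken by the string form of the edge so the result is deterministic.
    """
    ranked = sorted(weights, key=lambda edge: (-weights[edge], str(edge)))
    return ranked[:k]


def hit_rate(explainer_w: Dict[Hashable, float],
             annotation_w: Dict[Hashable, float],
             k: int) -> float:
    """Fraction of the top-k annotated edges found in the explainer's top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    reference = set(top_k(annotation_w, k))
    if not reference:
        return 0.0
    predicted = set(top_k(explainer_w, k))
    return len(reference & predicted) / len(reference)


def random_baseline(annotation_w: Dict[Hashable, float],
                    k: int,
                    trials: int = 100,
                    seed: int = 0) -> float:
    """Average hit rate of uniformly random edge weights over several trials."""
    rng = random.Random(seed)
    edges = list(annotation_w)
    total = 0.0
    for _ in range(trials):
        random_w = {edge: rng.random() for edge in edges}
        total += hit_rate(random_w, annotation_w, k)
    return total / trials


if __name__ == "__main__":
    annotation = {("a", "b"): 3.0, ("b", "c"): 2.0, ("c", "d"): 1.0, ("d", "e"): 0.0}
    explainer = {("a", "b"): 0.9, ("b", "c"): 0.1, ("c", "d"): 0.8, ("d", "e"): 0.2}
    print(hit_rate(explainer, annotation, k=2))
    print(random_baseline(annotation, k=2))
```

An explainer is only informative to the extent that its hit rate exceeds the random baseline at the same k.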
cd ./xfraud/supplement/06Centrality_measures/
python edge_betweenness.py --type-centrality 'edge_betweenness_centrality'
python line_graph_node_centrality.py --type-centrality 'degree'
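To illustrate what a node centrality on the line graph measures, the sketch below computes degree centrality for the edges of a simple undirected graph without building the line graph explicitly. In the line graph, every edge of the original graph becomes a node, and two such nodes are adjacent when the edges share an endpoint, so the line-graph degree of edge (u, v) is deg(u) + deg(v) - 2. This is an independent sketch and not the logic of line_graph_node_centrality.py; the function name and the edge-list input format are assumptions. With networkx, which is listed among the dependencies, `nx.degree_centrality(nx.line_graph(G))` should give the corresponding values for graphs with more than one edge.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def line_graph_degree_centrality(edges: Iterable[Edge]) -> Dict[Edge, float]:
    """Degree centrality of each edge, viewed as a node of the line graph.

    Self-loops and duplicate undirected edges are ignored (assumed simple graph).
    """
    unique = []
    seen = set()
    for u, v in edges:
        if u == v:
            continue  # skip self-loops
        key = frozenset((u, v))
        if key not in seen:
            seen.add(key)
            unique.append((u, v))

    # Node degrees in the original graph.
    degree = Counter()
    for u, v in unique:
        degree[u] += 1
        degree[v] += 1

    m = len(unique)  # number of nodes in the line graph
    if m <= 1:
        # Normalization by m - 1 is undefined; report 0.0 in this sketch.
        return {edge: 0.0 for edge in unique}

    # Edge (u, v) touches deg(u) - 1 other edges at u and deg(v) - 1 at v.
    return {(u, v): (degree[u] + degree[v] - 2) / (m - 1) for u, v in unique}


if __name__ == "__main__":
    path = [("a", "b"), ("b", "c"), ("c", "d")]
    print(line_graph_degree_centrality(path))
```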
cd ./xfraud/supplement/07Learning_hybrid/
# ridge
learn_equation-ridge.ipynb
# grid search with user defined A
python ours_learn-grid-A.py
# polynomial
learn_equation-polynomial.ipynb
# load the learned parameters and get the hybrid explainer weights
python ours-learn.py
cd ./xfraud/supplement/08Vis/
plot_community_prettify-hybrid.ipynb
We provide a small sample of the transaction graph and features in ./data.
We also provide the sample, its annotations, and the evaluation results (in ./xfraud/explainer-eval-hitrate) described in the section about the explainer.
All data files are described in the scripts that use them.
The data used in the paper is proprietary: it consists of real-world transaction records from the eBay platform. We can share our eBay-small dataset (desensitized transaction records) once you have signed a data sharing agreement provided by eBay and approved by eBay's legal team. The data can only be shared for a legitimate, non-commercial purpose. Please email "zitzhang@ebay.com" with the title "xFraud data request" and provide your name, address, and email address, and we will help you submit the request.
Copyright 2020-2021 eBay Inc.
Author/Developer: Susie Xi Rao, Shuai Zhang, Zhichao Han, Zitao Zhang, Wei Min, Zhiyao Chen, Yinan Shan, Yang Zhao, Ce Zhang
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the
License. You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
pyHGT is from the original HGT implementation, which is under the MIT license; see https://github.com/acbull/pyHGT/tree/master/pyHGT.
Licenses of the other Python packages are listed below:
Name               Version          License
dill               0.3.0            BSD License
fire               0.3.1            Apache Software License
networkx           2.4              BSD License
pandas             0.24.2           BSD
plyvel             1.3.0            BSD License
py4j               0.10.7           BSD License
pyarrow            0.15.1           Apache Software License
scikit-learn       0.22.2.post1     new BSD
scipy              1.4.1            BSD License
seaborn            0.9.0            BSD License
torch              1.5.0            BSD License
torch-cluster      1.5.7            MIT
torch-geometric    1.5.0            MIT
torch-scatter      2.0.5            MIT
torch-sparse       0.6.7            MIT
torch-spline-conv  1.2.0            MIT
torchvision        0.6.0a0+82fd1c8  BSD
tqdm               4.56.0           MIT License, Mozilla Public License 2.0 (MPL 2.0)
@article{rao2021xfraud,
title={xFraud: Explainable Fraud Transaction Detection},
author={Rao, Susie Xi and Zhang, Shuai and Han, Zhichao and Zhang, Zitao and Min, Wei and Chen, Zhiyao and Shan, Yinan and Zhao, Yang and Zhang, Ce},
journal={VLDB},
volume={15},
number={3},
year={2022},
howpublished = {\url{https://github.com/eBay/xFraud}},
doi = {10.14778/3494124.3494128},
}