Blockchain spiders aim to collect data from public blockchains, including:
- Transaction subgraph: the subgraph centered on a specific address
- Label data: the labels of addresses or transactions
- Block data: the blocks on chains
- ...
For more details, see our documentation.
Let's start with the following command:
git clone https://github.com/wuzhy1ng/BlockchainSpider.git
And then install the dependencies:
pip install -r requirements.txt
We will demonstrate how to crawl the transaction subgraph of the KuCoin hacker on Ethereum and trace the hacker's illicit funds.
Run the following command:
scrapy crawl txs.eth.ttr -a source=0xeb31973e0febf3e3d7058234a5ebbae1ab4b8c23
When the crawl has finished, you can find the transaction data in ./data/0xeb3...c23.csv.
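To trace several addresses in one go, the command above can be assembled from a script. The following is a minimal sketch and not part of BlockchainSpider: `build_crawl_command` and `crawl_sources` are hypothetical helpers that rely only on the spider name and the `-a source=...` argument shown above. It assumes `scrapy` is on the PATH and that the script is run from the repository root.

```python
import subprocess


def build_crawl_command(spider, **spider_args):
    """Return the argv list for `scrapy crawl <spider> -a key=value ...`."""
    command = ["scrapy", "crawl", spider]
    for key, value in spider_args.items():
        command += ["-a", f"{key}={value}"]
    return command


def crawl_sources(sources, spider="txs.eth.ttr", dry_run=True):
    """Build (and, unless dry_run is set, run) one crawl per source address."""
    commands = [build_crawl_command(spider, source=s) for s in sources]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if a crawl exits non-zero.
            subprocess.run(command, check=True)
    return commands
```

With the default `dry_run=True` the function only returns the commands, so they can be inspected before anything is crawled; pass `dry_run=False` to run them one after another.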
Try importing the transaction data, together with the importance of the addresses in the subgraph (./data/importance/0xeb3...c23.csv), into Gephi.
The hacker is linked to Tornado Cash, a mixing service, which suggests that the hacker took part in money laundering.
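Before opening Gephi, it can help to look at the highest-ranked addresses first. The sketch below assumes the importance file is a CSV with a header row and two columns, here called `node` and `importance`. Those column names are assumptions and are not taken from the BlockchainSpider documentation, so check the header of your file and pass the real names. `top_addresses` is a hypothetical helper.

```python
import csv


def top_addresses(lines, k=10, address_col="node", score_col="importance"):
    """Return the k (address, score) pairs with the highest score.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    The default column names are assumptions; adjust them to your file.
    """
    reader = csv.DictReader(lines)
    rows = [(row[address_col], float(row[score_col])) for row in reader]
    # Sort by score, highest first.
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:k]
```

To use it, open the importance file with `open(path, newline="", encoding="utf-8")` and pass the file object as `lines`.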
In this section, we will demonstrate how to collect the labeled addresses in the OFAC sanctions list.
Run the following command:
scrapy crawl labels.ofac
You can find the label data in ./data/labels.ofac. Each row of this file is a JSON object like this:
{
    "net": "ETH",
    "label": "Entity",
    "info": {
        "uid": "30518",
        "address": "0x72a5843cc08275C8171E582972Aa4fDa8C397B2A",
        "first_name": null,
        "last_name": "SECONDEYE SOLUTION",
        "identities": [
            {
                "id_type": "Email Address",
                "id_number": "support@secondeyesolution.com"
            },
            {
                "id_type": "Email Address",
                "id_number": "info@forwarderz.com"
            }
        ]
    }
}
Note: Please cite the source when using the crawled labels.
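Because each row is a standalone JSON object, the file can be read line by line. The following is a minimal sketch that groups the labeled addresses by chain, using only the `net` and `info.address` fields visible in the sample row above. `load_labels` and `addresses_by_net` are hypothetical helper names, and the sketch assumes every row follows the same layout as the sample.

```python
import json
from collections import defaultdict


def load_labels(lines):
    """Parse an iterable of JSON lines into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def addresses_by_net(records):
    """Group labeled addresses by their `net` field."""
    grouped = defaultdict(set)
    for record in records:
        # `info` may be missing or null in rows that differ from the sample.
        address = (record.get("info") or {}).get("address")
        if address:
            grouped[record.get("net")].add(address)
    return dict(grouped)
```

To use it, open ./data/labels.ofac with `encoding="utf-8"`, pass the file object to `load_labels`, and pass the result to `addresses_by_net`.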
In this section, we will demonstrate how to collect block data on Ethereum.
Run the following command:
scrapy crawl trans.blocks.web3
You can find the block data in ./data, in which:
- BlockItem.csv saves the metadata for blocks, such as miner, timestamp and so on.
- TransactionItem.csv saves the external transactions of blocks.
To get the best performance out of BlockchainSpider, please read about the APIKeys and Cache settings.
Please cite our paper (and the respective papers of the methods used) if you use this code in your own work:
@misc{wu2022tracer,
    title={TRacer: Scalable Graph-based Transaction Tracing for Account-based Blockchain Trading Systems},
    author={Zhiying Wu and Jieli Liu and Jiajing Wu and Zibin Zheng},
    year={2022},
    eprint={2201.05757},
    archivePrefix={arXiv},
    primaryClass={cs.CR}
}
Please execute the code in ./test to reproduce the experimental results in the paper.
- parameters.py: Parameter sensitivity experiment.
- compare.py: Comparative experiment.
- metrics.py: Export evaluation metrics.
For more information, please refer to ./test/README.md