/ethereum-datafarm

Scrape blockchain data from the public API of Etherscan.io

Primary language: Python. License: Apache License 2.0 (Apache-2.0)

ethereum-datafarm v2.0


Parse Smart Contract event data without requiring an archive/full node.

The ethereum-datafarm aims to provide quick access to historical Ethereum event data by offering an easy-to-use interface to parse event logs from contracts and save them in .csv format.

The ethereum-datafarm uses the Etherscan.io API, which can be used for free up to fairly generous limits.
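To make the data source concrete, here is a minimal sketch of how an event-log query against Etherscan could be assembled. It is not taken from this repository: the function name `build_get_logs_url` is hypothetical, and the endpoint and parameter names (`module=logs`, `action=getLogs`, `fromBlock`, `toBlock`, `address`, `topic0`, `apikey`) reflect Etherscan's logs API as commonly documented, so check the current Etherscan docs before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint; newer Etherscan API versions may use a different path.
ETHERSCAN_API = "https://api.etherscan.io/api"


def build_get_logs_url(address, topic0, from_block, to_block, api_key,
                       base_url=ETHERSCAN_API):
    """Return a URL that asks Etherscan for event logs in a block range.

    topic0 is the keccak-256 hash of the canonical event signature.
    """
    params = {
        "module": "logs",
        "action": "getLogs",
        "address": address,
        "topic0": topic0,
        "fromBlock": from_block,
        "toBlock": to_block,
        "apikey": api_key,
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # topic0 of Transfer(address,address,uint256)
    transfer_topic = ("0xddf252ad1be2c89b69c2b068fc378daa"
                      "952ba7f163c4a11628f55a4df523b3ef")
    print(build_get_logs_url(
        "0xDe30da39c46104798bB5aA3fe8B9e0e1F348163F",
        transfer_topic, 12422079, 12472078, "YourApiKey"))
```

The returned URL can be fetched with any HTTP client; the response is JSON that would then be flattened into CSV rows.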

Features:

  • Scrapes every type of event data from pre-defined contracts
  • Fetches ABIs from contracts to detect events
  • No local or Infura node required
  • Low CPU and RAM requirements
  • Multiprocessing support
  • Custom storage location using the -loc or --location flag: E.g. python3 run.py -loc ./myfolder

Example data output

[image: example data output] Or check out this sample output file of DAI transfers.

Usage

$ cd ./src
$ python3 run.py

OR

from ethereum_datafarm import *


if __name__ == "__main__":
    
    # Initialize Farm
    farm = Farm()
    
    # Load Contracts
    farm.load_contracts()
    
    # Start parsing
    farm.farm()

NOTE: If the event-emitting contract is a proxy contract (e.g. an upgradeable contract), the ABI detection may fail. In such cases, take the correct ABI from Etherscan and add the .abi file manually.

NOTE: If you use too many cores, you might reach the API rate limit (this will be logged). In such cases, use the -c or --cores flag to set the number of cores to be used. A value of -c 4 is recommended.

NOTE: To activate logging (useful for debugging), use the -log or --log flag. The logs are stored at "./logs.txt".
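
The flags mentioned in this README (-loc/--location, -c/--cores, -log/--log) could be wired up roughly as in the sketch below. This is an illustration with `argparse`, not the actual code of run.py: the function name `build_parser`, the help texts, and the default values are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three flags described in this README."""
    parser = argparse.ArgumentParser(
        description="Parse contract event logs via Etherscan (sketch)")
    # Default of None is an assumption; the real default location is not
    # stated in the README.
    parser.add_argument("-loc", "--location", default=None,
                        help="custom storage location for the .csv output")
    # Default of 4 mirrors the README's recommendation, not a confirmed
    # default of the tool.
    parser.add_argument("-c", "--cores", type=int, default=4,
                        help="number of cores to use")
    parser.add_argument("-log", "--log", action="store_true",
                        help="write logs to ./logs.txt")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.location, args.cores, args.log)
```

With this sketch, `python3 run.py -loc ./myfolder -c 4 -log` would yield `location="./myfolder"`, `cores=4` and `log=True`.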

Install from source

$ git clone https://github.com/Nerolation/ethereum-datafarm
$ cd ethereum-datafarm
$ python3 -m venv .
$ source bin/activate
$ pip install -r requirements.txt

Requirements:

  • Python 3.5 or higher
  • Etherscan API key (available for free at etherscan.io)

Make sure that contracts.csv has the following structure: contract address, custom name, canonical event signature, start block, chunk size.
0x30f938fED5dE6e06a9A7Cd2Ac3517131C317B1E7,giveth,Donate(uint64,uint64,address,uint256),5876857,50000
0x30f938fED5dE6e06a9A7Cd2Ac3517131C317B1E7,giveth,DonateAndCreateGiver(address,uint64,address,uint256),5876857,50000
0xDe30da39c46104798bB5aA3fe8B9e0e1F348163F,gitcoin,Transfer(address,address,uint256),12422079,50000
0x1fd169A4f5c59ACf79d0Fd5d91D1201EF1Bce9f1,molochdao,SubmitVote(uint256,address,address,uint8),7218566,50000
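
Note that the canonical event signature itself contains commas, so a row cannot simply be split on every comma. The sketch below shows one way to read such a row: take the first two fields from the left and the last two from the right. The names `ContractEntry` and `parse_contract_line` are hypothetical and not part of this repository.

```python
from typing import NamedTuple


class ContractEntry(NamedTuple):
    address: str
    name: str
    event: str
    start_block: int
    chunksize: int


def parse_contract_line(line):
    """Parse one contracts.csv row into its five fields."""
    # Address and name come first; everything after them is kept together.
    address, name, rest = line.strip().split(",", 2)
    # Start block and chunk size come last; what remains in the middle is
    # the event signature, commas included.
    event, start_block, chunksize = rest.rsplit(",", 2)
    return ContractEntry(address, name, event,
                         int(start_block), int(chunksize))


if __name__ == "__main__":
    row = ("0x30f938fED5dE6e06a9A7Cd2Ac3517131C317B1E7,giveth,"
           "Donate(uint64,uint64,address,uint256),5876857,50000")
    print(parse_contract_line(row))
```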

Demo

Initializes the farm and starts parsing data:

  • Loads contracts from contracts.csv file
  • Starts farm instance
  • Loops over contracts and saves data into .csv
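
The loop over a contract can be pictured as walking from its start block to the chain head in windows of the configured chunk size. The sketch below assumes that the chunk size in contracts.csv means "blocks per request"; the README does not confirm this, and the helper name `block_ranges` is hypothetical.

```python
def block_ranges(start_block, latest_block, chunksize):
    """Yield inclusive (from_block, to_block) windows of at most chunksize blocks."""
    from_block = start_block
    while from_block <= latest_block:
        # Clamp the last window so it never runs past the chain head.
        to_block = min(from_block + chunksize - 1, latest_block)
        yield from_block, to_block
        from_block = to_block + 1


if __name__ == "__main__":
    for window in block_ranges(12422079, 12550000, 50000):
        print(window)
```

Each window would correspond to one API query, whose results are appended to the contract's .csv file.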


Cite as

@misc{Wahrstaetter2022,
	title = {Ethereum-datafarm},
	url = {https://github.com/Nerolation/ethereum-datafarm},
	urldate = {2022-08-18},
	publisher = {GitHub},
	author = {Anton Wahrstätter},
	year = {2022},
}

Visit toniwahrstaetter.com for further details!

Anton Wahrstätter, 18.08.2022