
Ethereum Analytical Database is an Ethereum data access solution for analytics and application development. It is built on top of ClickHouse, a fast analytical database.


cyber•Drop core

Installation

To build all necessary containers (clickhouse, parity, grafana, core), use the command:

docker-compose up

This will immediately start the synchronization process.

You may have to wait a while for parity to fetch up-to-date data from the Ethereum chain.
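
To check whether parity has caught up with the chain, you can ask its JSON-RPC endpoint directly. This is a quick check that assumes the parity port 8545 is exposed to the host (as in the vanilla docker example below); eth_syncing returns false once the node is fully synced:

# Prints sync progress, or "false" when parity is up to date
curl -X POST -H "Content-Type: application/json" \
     --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' \
     localhost:8545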

Database state

The Docker bundle contains Grafana with a preconfigured dashboard, where you can check the state of the database (Grafana is served on port 3000).

Dashboard

Username: admin

Password: admin

Make sure that ports 8123 and 3000 are open.
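
A quick way to confirm both ports respond from the host (assuming the default port mapping from docker-compose.yml):

# Grafana answers on port 3000, the ClickHouse HTTP interface on port 8123
curl -s -o /dev/null -w "grafana: HTTP %{http_code}\n" localhost:3000
curl -s localhost:8123    # ClickHouse replies with "Ok."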

Examples

Usage examples of the crawlers are located in the examples directory of this repo; see that directory for the up-to-date list.

Bug reports

Feel free to create an issue if you run into problems with installation. Please provide the following info:

  • Your docker and docker-compose versions
  • The list of your modifications in containers
  • Actual state of the database (as a screenshot from grafana)
  • The log for unit tests:
docker-compose run core test

Advanced usage

Installation with vanilla docker

To build the docker container, use the command:

docker build -t cyberdrop/core .

To install parity, use:

docker pull parity/parity:stable
docker run -p 8545:8545 parity/parity:stable --jsonrpc-interface=all --tracing=on

To install clickhouse, use:

docker pull yandex/clickhouse-server:18.12.17
docker run -p 9000:9000 -p 8123:8123 yandex/clickhouse-server:18.12.17

You can see the actual options for these containers in the docker-compose.yml file.

Make sure the clickhouse and parity ports are reachable:

$ curl localhost:8545
Used HTTP Method is not allowed. POST or OPTIONS is required

$ curl localhost:9000
Port 9000 is for clickhouse-client program.
You must use port 8123 for HTTP.

Check the correctness of the installation by running:

docker run --network host cyberdrop/core test

You can run other operations the same way.
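
For example, a partial synchronization (the start command described under All operations below) can be launched with:

# Run a partial synchronization against the local parity and clickhouse
docker run --network host cyberdrop/core start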

Configuration

Configuration is located in the config.py file. Please check this list before installation:

...

# URLs of parity APIs.
# You can specify block range for each URL to use different nodes for each request
PARITY_HOSTS = [...]

# Dictionary of table names in database.
# Meaning of each table explained in Schema
INDICES = {...}

# List of contract addresses to process in several operations.
# All other contracts will be skipped during certain operations
PROCESSED_CONTRACTS = [...]

# Size of pages received from Clickhouse
BATCH_SIZE = 1000 # recommended

# Number of chunks processed simultaneously during input parsing
INPUT_PARSING_PROCESSES = 10 # recommended

# Number of blocks processed simultaneously during events extraction
EVENTS_RANGE_SIZE = 10 # recommended

# API key for etherscan.io ABI extraction
ETHERSCAN_API_KEY = "..."

...
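
With vanilla docker, an edited config.py can be mounted into the container instead of rebuilding the image. This is only a sketch, assuming the application code lives at /app inside the image (the actual path is not documented here, so check the Dockerfile):

# Override the baked-in configuration with a locally edited config.py
# (/app/config.py is an assumed in-container path - adjust it to the Dockerfile)
docker run --network host -v "$(pwd)/config.py:/app/config.py" cyberdrop/core test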

All operations

$ docker-compose run core --help

Usage: extractor.py [OPTIONS] COMMAND [ARGS]...

  Ethereum extractor

Options:
  --help  Show this message and exit.

Commands:
  prepare-database               Prepare all indices and views in database
  start                          Run partial synchronization of the database.
  start-full                     Run full synchronization of the database
  
  prepare-contracts-view         Prepare material view with contracts
  prepare-erc-transactions-view  Prepare material view with erc20
                                 transactions
  prepare-indices                Prepare tables in database
  extract-blocks                 Extract blocks with timestamp
  extract-events                 Extract events
  extract-traces                 Extract internal transactions
  extract-tokens                 Extract ERC20 token names, symbols, 
                                 total supply, etc.
  download-contracts-abi         Extract ABI description from etherscan.io
  download-prices                Download exchange rates
  parse-events-inputs            Start input parsing for events.
  parse-transactions-inputs      Start input parsing for transactions.
  
  test                           Run tests
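
For example, to create the schema and run a full synchronization step by step (one possible sequence; docker-compose up already starts synchronization automatically):

# Create all tables and views, then run a full synchronization
docker-compose run core prepare-database
docker-compose run core start-full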

Schema

The current data schema is shown below:

Schema
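
Once the tables are populated, they can be queried directly over the ClickHouse HTTP interface on port 8123. The exact table names depend on the INDICES setting in config.py, so the example below only lists them:

# List the tables created in ClickHouse
curl 'http://localhost:8123/?query=SHOW%20TABLES'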

Hardware requirements

Parity:

  • CPU: multi-core
  • RAM: 4 GB
  • Space: > 200 GB SSD

Clickhouse:

  • CPU: multi-core
  • RAM: 20 GB
  • Space: > 220 GB SSD

ETL:

  • CPU: multi-core
  • RAM: 4 GB

Tested on:

  • CPU: 6 cores (12 threads), 3.50 GHz
  • RAM: 256 GB
  • Space: 1 TB SSD