solana-snapshot-etl efficiently extracts all accounts in a snapshot to load them into an external system.
Solana nodes periodically back up their account database into a .tar.zst "snapshot" stream.
If you run a node yourself, you've probably seen a snapshot file such as this one already:
snapshot-139240745-D17vR2iksG5RoLMfTX7i5NwSsr4VpbybuX1eqzesQfu2.tar.zst
A full snapshot file contains a copy of all accounts at a specific slot state (in this case slot 139240745).
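The file name itself carries the slot number, which is handy when cataloguing many archives. The following is a minimal sketch, assuming the name always follows the `snapshot-<slot>-<hash>.tar.zst` shape seen above; the helper `parse_snapshot_name` and the field name `hash_str` are hypothetical and not part of the tool.

```python
import os
import re
from typing import NamedTuple

# Assumed layout: snapshot-<slot>-<hash>.tar.zst (inferred from the example name).
SNAPSHOT_NAME = re.compile(r"^snapshot-(?P<slot>\d+)-(?P<hash>[0-9A-Za-z]+)\.tar\.zst$")


class SnapshotName(NamedTuple):
    slot: int       # slot the snapshot was taken at
    hash_str: str   # trailing identifier from the file name, kept as text


def parse_snapshot_name(path: str) -> SnapshotName:
    """Split a full snapshot file name into its slot and trailing hash string."""
    filename = os.path.basename(path)
    match = SNAPSHOT_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a full snapshot file name: {filename!r}")
    return SnapshotName(slot=int(match["slot"]), hash_str=match["hash"])
```

Calling `parse_snapshot_name` on the example file name yields slot 139240745; names that do not match the assumed pattern raise `ValueError` rather than being guessed at.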
Historical account data is relevant to blockchain analytics use cases and event tracing. Although snapshot archives are readily available, the ecosystem lacked an easy-to-use tool for accessing snapshot data.
cargo install --git https://github.com/kunalmodi/solana-snapshot-etl --features=standalone --bins
The ETL tool can extract snapshots from a variety of streaming sources and load them into one of the supported storage backends.
The basic command-line usage is as follows:
USAGE:
solana-snapshot-etl [OPTIONS] <LOAD_FLAGS> <SOURCE>
Extract from a local snapshot file:
solana-snapshot-etl /path/to/snapshot-*.tar.zst ...
Extract from an unpacked snapshot:
# Example unarchive command (the target directory must exist before extracting with -C)
mkdir -p ./unpacked_snapshot
tar -I zstd -xvf snapshot-*.tar.zst -C ./unpacked_snapshot/
solana-snapshot-etl ./unpacked_snapshot/
Stream snapshot from HTTP source or S3 bucket:
solana-snapshot-etl 'https://my-solana-node.bdnodes.net/snapshot.tar.zst?auth=xxx' ...
Dump all accounts and certain other tables to a Postgres database. Writes are batched (--postgres-batch-size) and spread across a configurable number of threads (--postgres-threads).
solana-snapshot-etl snapshot-139240745-*.tar.zst --postgres-out "user=solana password= host=localhost dbname=solana options='-c synchronous_commit=off'" --postgres-threads 16 --postgres-batch-size 500
The resulting Postgres database contains the following tables.
account
token_account (SPL Token Program)
token_mint (SPL Token Program)
token_metadata (MPL Metadata Program)
The fastest way to access snapshot data is the SQLite3 load mechanism.
The resulting SQLite database file can be loaded using any SQLite client library.
solana-snapshot-etl snapshot-139240745-*.tar.zst --sqlite-out snapshot.db
The resulting SQLite database contains the following tables.
account
token_account (SPL Token Program)
token_mint (SPL Token Program)
token_multisig (SPL Token Program)
token_metadata (MPL Metadata Program)
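Since any SQLite client library can open the output file, a quick sanity check after a load is to list the tables and count their rows. The sketch below uses Python's standard `sqlite3` module and makes no assumption about column names; the helper `table_row_counts` is hypothetical, and `snapshot.db` is simply the path passed to --sqlite-out above.

```python
import sqlite3
from typing import Dict


def table_row_counts(db_path: str) -> Dict[str, int]:
    """Return {table name: row count} for every user table in the database."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists all schema objects; skip SQLite's internal tables.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier, since table names cannot be bound as parameters.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()

# Example: table_row_counts("snapshot.db")
```

On a full mainnet snapshot the `COUNT(*)` over the account table may take a while, so this is meant as a one-off check rather than something to run repeatedly.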
Coming soon!
Much like solana-validator, this tool can write account updates to Geyser plugins.
solana-snapshot-etl snapshot-139240745-*.tar.zst --geyser plugin-config.json
For more info, consult Solana's docs: https://docs.solana.com/developing/plugins/geyser-plugins
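For orientation, Solana's Geyser documentation describes the plugin config as a JSON file whose `libpath` field points at the plugin's shared library, with any further keys interpreted by the plugin itself. The sketch below writes such a file; it assumes this tool accepts the same format as solana-validator, and the helper `write_geyser_config`, the library path, and the extra option shown are all hypothetical placeholders.

```python
import json
from typing import Any, Dict


def write_geyser_config(config_path: str, libpath: str, **plugin_options: Any) -> Dict[str, Any]:
    """Write a Geyser plugin config file and return the dict that was written.

    `libpath` is the path to the plugin's shared library; `plugin_options`
    are plugin-specific keys that are passed through untouched.
    """
    config = {"libpath": libpath, **plugin_options}
    with open(config_path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    return config

# Example with placeholder values (not a real plugin):
# write_geyser_config("plugin-config.json", "/path/to/libmy_plugin.so", my_option="value")
```

Which extra keys are valid depends entirely on the plugin being loaded, so check that plugin's own documentation.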
The --programs-out flag exports all Solana programs (in ELF format).
solana-snapshot-etl snapshot-139240745-*.tar.zst --programs-out programs.tar
Or, to extract in place by streaming the archive to tar over standard output:
solana-snapshot-etl snapshot-139240745-*.tar.zst --programs-out - | tar -xvf -
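To inspect the exported archive without unpacking it, the members can be walked with Python's standard `tarfile` module. This is a rough sketch: the helper `list_elf_members` is hypothetical, and nothing is assumed about how members are named inside `programs.tar`; it only checks each regular file for the 4-byte ELF magic number.

```python
import tarfile
from typing import List, Tuple

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def list_elf_members(tar_path: str) -> List[Tuple[str, int, bool]]:
    """Return (member name, size in bytes, starts with ELF magic) per regular file."""
    results = []
    with tarfile.open(tar_path, "r") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            handle = archive.extractfile(member)
            header = handle.read(4) if handle is not None else b""
            results.append((member.name, member.size, header == ELF_MAGIC))
    return results

# Example: list_elf_members("programs.tar")
```

Reading only the first four bytes of each member keeps the check cheap even when the archive holds many large programs.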