[data] stream database updates to ETL
Opened this issue · 0 comments
sirouk commented
Streaming transactions
The goal is to be able to subscribe to or query structured transaction data and produce it for downstream consumers such as a data warehouse or a graph database.
Option 1:
Build a tool that subscribes to the gRPC endpoint exposed by the indexer_grpc service that ships with the node.
Option 2:
Create a new Rust binary that uses the same state sync API that nodes use, acting as a node and requesting transactions.
Progress:
I used the following repo as a reference. After poking around, I was able to configure a fullnode to respond to gRPC requests for transactions:
https://github.com/aptos-labs/aptos-indexer-processors
The steps below should get you a working example. This is not yet a method for streaming, just a starting point.
# prepare the fullnode to answer gRPC calls
# add the following to fullnode.yaml or vfn.yaml
indexer_grpc:
  enabled: true
  address: 0.0.0.0:50051
# restart node
# install Go (manual)
sudo rm -Rf /usr/local/go
cd ~
wget https://go.dev/dl/go1.20.14.linux-amd64.tar.gz
tar -xvf go1.20.14.linux-amd64.tar.gz
sudo mv ~/go /usr/local/
# add to env
nano ~/.bashrc
# add the following
export GOROOT=/usr/local/go
export GOPATH=$HOME/go
export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
# source the env
. ~/.bashrc
# install grpcurl
go install github.com/fullstorydev/grpcurl/cmd/grpcurl@latest
# clone the diem repo
rm -rf ~/diem
cd ~
git clone https://github.com/0LNetworkCommunity/diem
# within the diem repo
cd ~/diem
# fire a test grpc request
grpcurl -max-msg-sz 10000000 \
  -d '{ "starting_version": 18437800 }' \
  -import-path crates/diem-protos/proto \
  -proto diem/internal/fullnode/v1/fullnode_data.proto \
  -plaintext 127.0.0.1:50051 \
  diem.internal.fullnode.v1.FullnodeData/GetTransactionsFromNode | jq
# produce!
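One possible next step toward the "produce" stage is to wrap the test request in a small resumable script. The sketch below is not a finished streaming method. It reuses the grpcurl call from above, emits one compact JSON transaction per line, and records the next version to request in a checkpoint file so a rerun picks up where the last one stopped. The checkpoint path, the function names, and the `.data.transactions[].version` response shape are assumptions for illustration and should be checked against the actual output of the test request.

```shell
#!/usr/bin/env bash
# Sketch only. The checkpoint path, function names, and the response shape
# (.data.transactions[].version) are assumptions, not part of the diem repo.

CHECKPOINT_FILE="${CHECKPOINT_FILE:-$HOME/.txn_stream_checkpoint}"  # hypothetical path
GRPC_ADDR="${GRPC_ADDR:-127.0.0.1:50051}"

# Print the JSON request body for a given starting version.
build_request() {
  printf '{ "starting_version": %d }' "$1"
}

# Print the version to resume from: the saved checkpoint if one exists,
# otherwise the default passed as $1.
resume_version() {
  if [ -s "$CHECKPOINT_FILE" ]; then
    cat "$CHECKPOINT_FILE"
  else
    printf '%s\n' "$1"
  fi
}

# Record that version $1 was handled, so the next run starts at $1 + 1.
# Rejects anything that is not a plain non-negative integer.
save_checkpoint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  printf '%d\n' "$(( $1 + 1 ))" > "$CHECKPOINT_FILE"
}

# Run from inside ~/diem. $1 is the version to start from when no checkpoint
# exists. Writes one JSON transaction per line to stdout.
stream_transactions() {
  local start
  start="$(resume_version "$1")"
  grpcurl -max-msg-sz 10000000 \
    -d "$(build_request "$start")" \
    -import-path crates/diem-protos/proto \
    -proto diem/internal/fullnode/v1/fullnode_data.proto \
    -plaintext "$GRPC_ADDR" \
    diem.internal.fullnode.v1.FullnodeData/GetTransactionsFromNode \
  | jq -c --unbuffered '.data.transactions[]?' \
  | while IFS= read -r txn; do
      printf '%s\n' "$txn"
      save_checkpoint "$(printf '%s' "$txn" | jq -r '.version')"
    done
}
```

After sourcing the file from inside ~/diem, something like `stream_transactions 18437800 >> transactions.ndjson` would append line-delimited JSON that a warehouse loader or queue producer could consume. The checkpoint is written after each transaction is printed, so a crash can replay at most the last transaction rather than skip one; the downstream loader should tolerate that duplicate.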