
DragonflyBot

A vertically scalable stream processing framework focusing on low latency, helping you scale and consume financial data feeds.

Design

The framework consists of five layers; each layer can be easily extended and can have its own builder.

Client/protocol layer

Here different clients (supporting different protocols) can be defined, e.g. one builder for WebSocket clients and a different builder for FIX clients.
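
As a rough sketch of what this layer could look like (the FeedClient trait and WebSocketClientBuilder below are illustrative names, not the framework's actual API):

pub trait FeedClient {
    /// Read the next raw message from the venue.
    fn next_message(&mut self) -> std::io::Result<Vec<u8>>;
}

pub struct WebSocketClientBuilder {
    url: String,
}

impl WebSocketClientBuilder {
    pub fn new(url: impl Into<String>) -> Self {
        Self { url: url.into() }
    }

    /// Connect and return a boxed client so upper layers stay protocol-agnostic.
    pub fn build(self) -> std::io::Result<Box<dyn FeedClient>> {
        todo!("open a WebSocket connection to {}", self.url)
    }
}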

Feed subscriber layer

Once we have a connected client, each client/protocol generally requires a different subscription approach, e.g. a FIX connection might require different methods than a WebSocket connection. We define (or extend) those requirements in this layer.
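
A minimal sketch of this layer, assuming hypothetical names (nothing here is the framework's real API):

/// A subscriber wraps a connected client and knows how to request a feed in
/// that protocol's dialect (e.g. a JSON subscribe payload over WebSocket, or
/// a MarketDataRequest over FIX).
pub trait FeedSubscriber {
    type Client;

    fn subscribe(&mut self, client: &mut Self::Client, instrument: &str) -> std::io::Result<()>;
}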

Feed listener layer

Feed listeners are already subscribed to feeds (i.e. they require an active subscriber) and deal only with processing/responding to the data, e.g. forwarding a message only if the top of the order book has changed.
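
For example, a listener that forwards an update only when the top of the book actually changed could look roughly like this (the Bbo struct and channel wiring are assumptions for illustration):

#[derive(Clone, PartialEq, Debug)]
pub struct Bbo {
    pub best_bid: f64,
    pub best_ask: f64,
}

pub struct TopOfBookListener {
    last: Option<Bbo>,
    out: std::sync::mpsc::Sender<Bbo>,
}

impl TopOfBookListener {
    /// Called for every decoded order book update; emits downstream only if
    /// the top of the book moved.
    pub fn on_update(&mut self, bbo: Bbo) {
        if self.last.as_ref() != Some(&bbo) {
            let _ = self.out.send(bbo.clone());
            self.last = Some(bbo);
        }
    }
}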

Feed listener aggregator layer

In practice, we connect to multiple feeds/venues. To get a world view of all of them, we need to consume what the feed listeners emit in one place/worker. For example, we can consume all the order book messages emitted (and filtered) by the feed listeners and construct the best bid and best offer (BBO) across all order books, i.e. a BBO world view.
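
A simplified sketch of such an aggregator (names and types are illustrative, not the framework's actual API):

use std::collections::HashMap;

#[derive(Clone, Debug)]
pub struct Bbo {
    pub best_bid: f64,
    pub best_ask: f64,
}

#[derive(Default)]
pub struct BboAggregator {
    per_venue: HashMap<String, Bbo>,
}

impl BboAggregator {
    /// Store the latest BBO reported by a single venue's feed listener.
    pub fn update(&mut self, venue: &str, bbo: Bbo) {
        self.per_venue.insert(venue.to_owned(), bbo);
    }

    /// World view: highest bid and lowest ask across all venues seen so far.
    pub fn world_bbo(&self) -> Option<Bbo> {
        if self.per_venue.is_empty() {
            return None;
        }
        let best_bid = self.per_venue.values().map(|b| b.best_bid).fold(f64::MIN, f64::max);
        let best_ask = self.per_venue.values().map(|b| b.best_ask).fold(f64::MAX, f64::min);
        Some(Bbo { best_bid, best_ask })
    }
}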

Service layer

The final layer: the place where you define services that consume from feed listener aggregators and serve your subscribers, e.g. a bot could subscribe to your service to get the BBO world view.
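
A minimal sketch of a service worker, assuming tokio channels glue the layers together (the channel types and the (bid, ask) payload are assumptions for illustration):

use tokio::sync::{broadcast, mpsc};

/// Consume world-view BBOs coming out of an aggregator and fan them out to
/// however many subscribers are currently connected (e.g. gRPC streams).
pub async fn serve_bbo(
    mut from_aggregator: mpsc::Receiver<(f64, f64)>,
    to_subscribers: broadcast::Sender<(f64, f64)>,
) {
    while let Some(bbo) = from_aggregator.recv().await {
        // `send` errors only when no subscriber is listening; safe to ignore.
        let _ = to_subscribers.send(bbo);
    }
}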

Allowed topologies

Since the layers are connected only by message-passing queues, we can quickly change topologies, from simple ones

graph TD;
    trading_bot-->gRPC_server;
    
    gRPC_server-->feed_listener_aggregator;
    feed_listener_aggregator-->feed_listener1;
    feed_listener_aggregator-->feed_listener2;
    feed_listener_aggregator-->feed_listener3;

    feed_listener1-->feed_subscriber1;
    feed_listener2-->feed_subscriber2;
    feed_listener3-->feed_subscriber3;

    feed_subscriber1-->client1/protocol1;
    feed_subscriber2-->client2/protocol2;
    feed_subscriber3-->client3/protocol1;

to more complex ones

graph TD;
    trading_bot-->gRPC_server;
    legacy_customer-->FIX_server;
    
    gRPC_server-->|stream BBO| feed_listener_aggregator1;
    FIX_server-->|stream BBO| feed_listener_aggregator1;
    FIX_server-->|stream aggregated liquidity| feed_listener_aggregator2;
    FIX_server-->|liquidate big order| execution_engine

    feed_listener_aggregator1-->feed_listener1;
    feed_listener_aggregator1-->feed_listener2;
    feed_listener_aggregator1-->feed_listener3;

    feed_listener_aggregator2-->feed_listener1;
    feed_listener_aggregator2-->feed_listener2;
    feed_listener_aggregator2-->feed_listener3;

    feed_listener1-->feed_subscriber1;
    feed_listener2-->feed_subscriber2;
    feed_listener3-->feed_subscriber3;

    feed_subscriber1-->client1/protocol1;
    feed_subscriber2-->client2/protocol2;
    feed_subscriber3-->client3/protocol1;

simply by changing the type of the queue (e.g. MPSC to SPMC) and introducing transformers where needed.
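
As a minimal sketch of what that swap amounts to (the channel choice here is an assumption, using tokio): the simple topology can keep a single-consumer mpsc queue between a listener and its aggregator, while the complex topology fans one listener out to two aggregators via a broadcast queue, so each receives every message.

use tokio::sync::{broadcast, mpsc};

// Simple topology: listener -> one aggregator.
fn simple_queue() -> (mpsc::Sender<u64>, mpsc::Receiver<u64>) {
    mpsc::channel(1024)
}

// Complex topology: listener -> aggregator1 and aggregator2,
// each getting its own copy of every message.
fn fan_out_queue() -> (
    broadcast::Sender<u64>,
    broadcast::Receiver<u64>,
    broadcast::Receiver<u64>,
) {
    let (tx, rx1) = broadcast::channel(1024);
    let rx2 = tx.subscribe();
    (tx, rx1, rx2)
}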

Performance considerations

WebSocket client

Based on WS benchmarks, websocket.rs provides the fastest WS client; however, TLS doesn't seem to be supported. The second fastest, fastwebsockets, supports TLS and reading single frames. So if we're only interested in the top of the book, we can read only part of the whole message, i.e. only the frames needed to get the top N levels of the book.
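
A rough sketch of the early-exit idea; read_frame below stands in for the WS library's per-frame read call and the stop condition is deliberately naive, so treat this as an illustration rather than the framework's actual parsing logic:

/// Keep pulling WS frames into `buf` and stop as soon as the partial payload
/// already contains the first `n_levels` price levels (naive heuristic: each
/// level such as ["price","qty"] closes with a ']').
fn read_until_top_n(mut read_frame: impl FnMut() -> Vec<u8>, n_levels: usize) -> Vec<u8> {
    let mut buf = Vec::new();
    loop {
        buf.extend_from_slice(&read_frame());
        if buf.iter().filter(|&&b| b == b']').count() >= n_levels {
            return buf;
        }
    }
}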

JSON document parsing

Since we're only interested in top-level access (getting bids/asks), we can go with property-based parsing libraries, which are up to 10x faster than libraries that parse the whole JSON document. On top of property-based parsing we also implement the functionality of JSON stream parsers, parsing only the number of consecutive values needed, i.e. to get the top 10 of the book we parse only 10 values.
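
A deliberately naive sketch of the idea (this is not the parser the framework uses): locate the "bids" property and stop after the first n levels instead of materialising the whole document.

/// Naive illustration: grab the first `n` levels of the "bids" array without
/// parsing the rest of the document. Assumes compact output like
/// {"bids":[["100.1","2.0"],["100.0","1.5"]], ...}.
fn top_n_bids(json: &str, n: usize) -> Vec<(f64, f64)> {
    let start = match json.find("\"bids\":[") {
        Some(i) => i + "\"bids\":[".len(),
        None => return Vec::new(),
    };
    json[start..]
        .split("],") // each level looks like ["price","qty"]
        .take(n)     // stop scanning after the top n levels
        .filter_map(|level| {
            let mut it = level
                .split(',')
                .map(|v| v.trim_matches(|c: char| c == '"' || c == '[' || c == ']'));
            Some((it.next()?.parse::<f64>().ok()?, it.next()?.parse::<f64>().ok()?))
        })
        .collect()
}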

Alternative allocators

tikv-jemallocator is used to improve allocation performance.
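
Opting a binary into jemalloc is done by declaring it as the global allocator (the standard tikv-jemallocator usage):

use tikv_jemallocator::Jemalloc;

// Route all heap allocations in this binary through jemalloc.
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;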

Other opportunities

For small vectors, SmallVec could be used to avoid heap allocations.
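
For example, a top-10 book snapshot can stay inline on the stack and only spill to the heap past 10 levels (sketch, assuming (price, quantity) tuples):

use smallvec::SmallVec;

fn main() {
    // Inline capacity of 10 (price, quantity) levels; heap allocation only
    // happens if an 11th element is pushed.
    let mut bids: SmallVec<[(f64, f64); 10]> = SmallVec::new();
    bids.push((100.1, 2.5));
    println!("{bids:?}");
}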

Usage

To run an example where we aggregate order books and publish top 10 via a gRPC server:

# start the gRPC server in the background
cargo run --bin dragonflybot-grpc-server -- --instrument-name ethbtc &
 
# run the client
cargo run --bin dragonflybot-grpc-client

# build the Docker image
DOCKER_BUILDKIT=1 docker build -t dragonflybot:latest .

# run the server in the background
docker run \
  --name dragonflybot_grpc_server \
  --rm \
  --user="$(id -u):$(id -g)" \
  --group-add="$(id -g)" \
  -p 127.0.0.1:50051:50051 \
  dragonflybot:latest \
  dragonflybot-grpc-server --instrument-name ethbtc &
 
# run the gRPC client
docker run \
  --name dragonflybot_grpc_client \
  --rm \
  --user="$(id -u):$(id -g)" \
  --group-add="$(id -g)" \
  --net="host" \
  dragonflybot:latest \
  dragonflybot-grpc-client
  
# to stop containers
docker stop dragonflybot_grpc_client \
  && docker stop dragonflybot_grpc_server

Developing

For development, a multi-stage, multi-branch Dockerfile is available, supporting Rust build caching via cargo-chef.

# run tests
cargo test

# build the dev Docker image
DOCKER_BUILDKIT=1 docker build --build-arg "BUILD_PROFILE=dev" -t dragonflybot_dev:latest .